Description
First of all, thanks for the wonderful project! colly has saved our team a lot of time!!
Context
According to RFC7578 section 4.3:
4.3. Multiple Files for One Form Field
The form data for a form field might include multiple files.
[RFC2388] suggested that multiple files for a single form field be transmitted using a nested "multipart/mixed" part. This usage is deprecated. To match widely deployed implementations, multiple files MUST be sent by supplying each file in a separate part but all with the same "name" parameter.
Receiving applications intended for wide applicability (e.g., multipart/form-data parsing libraries) SHOULD also support the older method of supplying multiple files.
This practice is unsurprisingly common, and I am facing exactly this case.
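To illustrate what the RFC describes, here is a minimal sketch using Go's standard `mime/multipart` package rather than colly. It writes two files under the same field name and parses the body back to confirm both arrive. The `upload` type, the helper names, and the field name `files` are assumptions made up for this example.

```go
package main

import (
	"bytes"
	"fmt"
	"mime/multipart"
)

// upload is a hypothetical helper type for this example only.
type upload struct {
	filename string
	content  []byte
}

// buildRepeatedField writes one part per upload, all sharing the same
// "name" parameter, and returns the encoded body and its boundary.
func buildRepeatedField(field string, uploads []upload) ([]byte, string, error) {
	var body bytes.Buffer
	w := multipart.NewWriter(&body)
	for _, u := range uploads {
		part, err := w.CreateFormFile(field, u.filename)
		if err != nil {
			return nil, "", err
		}
		if _, err := part.Write(u.content); err != nil {
			return nil, "", err
		}
	}
	// Close writes the terminating boundary.
	if err := w.Close(); err != nil {
		return nil, "", err
	}
	return body.Bytes(), w.Boundary(), nil
}

// countParts parses the body and counts the files received under field.
func countParts(body []byte, boundary, field string) (int, error) {
	r := multipart.NewReader(bytes.NewReader(body), boundary)
	form, err := r.ReadForm(1 << 20) // keep up to 1 MiB in memory
	if err != nil {
		return 0, err
	}
	defer form.RemoveAll()
	return len(form.File[field]), nil
}

func main() {
	body, boundary, err := buildRepeatedField("files", []upload{
		{"a.txt", []byte("alpha")},
		{"b.txt", []byte("beta")},
	})
	if err != nil {
		panic(err)
	}
	n, err := countParts(body, boundary, "files")
	if err != nil {
		panic(err)
	}
	fmt.Println("files received under one name:", n)
}
```

A receiving parser groups the parts by name, so both files end up under the single `files` key.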
The issue
The name field does not have to be unique. There are a few common cases where a duplicated name field is required (e.g., when uploading an array of files), and these cases should be properly covered.
Lines 551 to 559 in 99b7fb1
// PostMultipart starts a collector job by creating a Multipart POST request
// with raw binary data. PostMultipart also calls the previously provided callbacks
func (c *Collector) PostMultipart(URL string, requestData map[string][]byte) error {
	boundary := randomBoundary()
	hdr := http.Header{}
	hdr.Set("Content-Type", "multipart/form-data; boundary="+boundary)
	hdr.Set("User-Agent", c.UserAgent)
	return c.scrape(URL, "POST", 1, createMultipartReader(boundary, requestData), nil, hdr, true)
}
Lines 1461 to 1469 in 99b7fb1
buffer.WriteString("Content-type: multipart/form-data; boundary=" + boundary + "\n\n")
for contentType, content := range data {
	buffer.WriteString(dashBoundary + "\n")
	buffer.WriteString("Content-Disposition: form-data; name=" + contentType + "\n")
	buffer.WriteString(fmt.Sprintf("Content-Length: %d \n\n", len(content)))
	buffer.Write(content)
	buffer.WriteString("\n")
}
buffer.WriteString(dashBoundary + "--\n\n")
Unfortunately, the current implementation accepts map[string][]byte, which forces each name to be unique.
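A small sketch of why the map type is the limiting factor: assigning two values to the same key keeps only the last one, so a second file under the same name replaces the first before the request is ever built. The `part` type and the `toMap` helper are hypothetical and exist only for this illustration.

```go
package main

import "fmt"

// part is a hypothetical name/content pair used only in this example.
type part struct {
	name    string
	content []byte
}

// toMap packs parts into the map type that PostMultipart accepts.
// A later part with the same name overwrites the earlier one.
func toMap(parts []part) map[string][]byte {
	requestData := make(map[string][]byte)
	for _, p := range parts {
		requestData[p.name] = p.content
	}
	return requestData
}

func main() {
	requestData := toMap([]part{
		{"files", []byte("first file")},
		{"files", []byte("second file")},
	})
	// Only one entry survives.
	fmt.Println(len(requestData), string(requestData["files"]))
}
```

In addition, Go does not specify the iteration order of a map, so the parts produced by ranging over it are not guaranteed to appear in the order the caller added them.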
Suggestion
Maybe we can accept []Subpart so that:
- The order of subparts is guaranteed
- filename and other metadata can be optionally included
- Duplicate name fields are allowed
and so on.
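A rough sketch of what such a type could look like, built on the standard `mime/multipart` writer. The `Subpart` struct, its fields, and the `encodeSubparts` and `partNames` helpers are all hypothetical. None of them exist in colly, and the final design may well differ.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"mime/multipart"
)

// Subpart is a hypothetical type for this proposal, not an existing colly API.
type Subpart struct {
	Name     string // form field name; may repeat across subparts
	Filename string // optional; empty means a plain form field
	Content  []byte
}

// encodeSubparts writes the subparts in slice order and returns the
// encoded body together with the boundary that was used.
func encodeSubparts(parts []Subpart) ([]byte, string, error) {
	var body bytes.Buffer
	w := multipart.NewWriter(&body)
	for _, p := range parts {
		var pw io.Writer
		var err error
		if p.Filename != "" {
			pw, err = w.CreateFormFile(p.Name, p.Filename)
		} else {
			pw, err = w.CreateFormField(p.Name)
		}
		if err != nil {
			return nil, "", err
		}
		if _, err := pw.Write(p.Content); err != nil {
			return nil, "", err
		}
	}
	if err := w.Close(); err != nil {
		return nil, "", err
	}
	return body.Bytes(), w.Boundary(), nil
}

// partNames parses the body and lists "name:filename" in wire order.
func partNames(body []byte, boundary string) ([]string, error) {
	r := multipart.NewReader(bytes.NewReader(body), boundary)
	var names []string
	for {
		p, err := r.NextPart()
		if err == io.EOF {
			return names, nil
		}
		if err != nil {
			return nil, err
		}
		names = append(names, p.FormName()+":"+p.FileName())
	}
}

func main() {
	body, boundary, err := encodeSubparts([]Subpart{
		{Name: "files", Filename: "a.txt", Content: []byte("alpha")},
		{Name: "files", Filename: "b.txt", Content: []byte("beta")},
		{Name: "note", Content: []byte("two attachments")},
	})
	if err != nil {
		panic(err)
	}
	names, err := partNames(body, boundary)
	if err != nil {
		panic(err)
	}
	fmt.Println(names)
}
```

Because the input is a slice, the parts are written in the caller's order, and the two `files` entries coexist instead of overwriting each other.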
I would love to hear your opinion! If you think this is feasible, I will start working on it.