Support for VARBINARY(MAX) field type in ADS
Hello I have a user case that requires migrating Salesforce objects that contain attachments such as HTML Body, PDF's, Word, Excel and jpg's attachments to another instance of Salesforce. These are normally stored as a VARBINARY(MAX) in a MS SQL database and I think would be a Blob in SQL Lite. Given these data types are not supported in Aperture Data Studio ADS has anyone heard of a workaround using the SDK, custom exporters, pushdown or other hacks so that the data is stored in underlying SQL Lite DB and can be exported to a target without having to be part of a dataset. No processing, transformations or sorting are required on the field. Thanks.
Answers
-
-
Data Studio doesn't currently have a "BLOB" type, so any data imported from this type of field is treated as ALPHANUMERIC.
Data Studio's maximum value size is 32k bytes and most attachments will be larger than this. As a result, when loading attachment contents from the
VersionData
field in theContentVersion
table, we see the max value size at play for the BLOB data itself (VersionData field), which is truncated at 32,766 bytes (the reported ContentSize in SF is 164,397 for the highlighted row):Much smaller attachments (rows 1,3,4) can be loaded successfully.
In future there may be an option to increase the maximum size, or do some compression, for certain values where we don't intent that any operation would be performed on the value - In this case you'd just want to pass the LOB right through from source to target. However this is not currently possible.
Henry
Appendix - Notes on how Salesforce stores attachments:
Enhanced Files Model: Salesforce has gradually moved from the
Attachment
model to a newer, more flexible "Files" model. This means that attachments will now be stored in theContentVersion
andContentDocument
objects instead of theAttachment
object. To fetch files, you'd typically query these tables:ContentDocument
: Represents a document that has been uploaded to a library- ContentDocument is the parent object of both ContentVersion and ContentDocumentLink
- ContentDocument records can’t be inserted or updated directly - The document is created by Salesforce when a ContentVersion record is inserted and its ContentDocumentId field is null
ContentVersion
: This represents a specific version of the document.- For
ContentVersion
, you can retrieve the file content through theVersionData
column, which is also in BLOB format.
- For
ContentDocumentLink
: This object links (describes the relationship between) the document and the parent object (e.g., Group, Case, Opportunity, etc.). Represents where it is shared- A document can be (usually is) shared with multiple objects, eg a Case and a User
- You can't run a query without filters against ContentDocumentLink.
- A SOQL query must filter on one of Id, ContentDocumentId (get all linked objects for a doc), or LinkedEntityId (get all linked docs for a given object)
- This may represent a challenge for Data Studio, as it makes it difficult to associate objects with Content Documents in bulk
Selected references and sources:
-
-
-
-
1