Overview
You can configure the following properties when writing data to MongoDB in batch mode.
Note
If you use SparkConf to set the connector's write configurations, prefix spark.mongodb.write. to each property.
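The prefixing rule in the note above can be sketched in a few lines of Python. This is a minimal illustration, not part of the connector: the helper name with_write_prefix is hypothetical, and the property values are placeholders.

```python
WRITE_PREFIX = "spark.mongodb.write."


def with_write_prefix(properties):
    """Return a new dict whose keys carry the spark.mongodb.write. prefix.

    Hypothetical helper for illustration; keys that already carry the
    prefix are left unchanged, and values are converted to strings.
    """
    prefixed = {}
    for key, value in properties.items():
        name = key if key.startswith(WRITE_PREFIX) else WRITE_PREFIX + key
        # Render booleans in lowercase so they read as true/false.
        text = str(value).lower() if isinstance(value, bool) else str(value)
        prefixed[name] = text
    return prefixed


conf_entries = with_write_prefix({
    "connection.uri": "mongodb://127.0.0.1/",
    "database": "myDB",
    "collection": "myCollection",
    "ordered": True,
})

# Print the entries in key=value form.
for name, value in conf_entries.items():
    print(f"{name}={value}")
```

In PySpark, a dictionary like conf_entries could then be handed to SparkConf, for example with SparkConf().setAll(list(conf_entries.items())).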
Property name | Description
---|---
connection.uri | Required. The connection string configuration key. Default: mongodb://localhost:27017/
database | Required. The database name configuration.
collection | Required. The collection name configuration.
comment | The comment to append to the write operation. Comments appear in the output of the Database Profiler. Default: None
mongoClientFactory | MongoClientFactory configuration key. You can specify a custom implementation, which must implement the com.mongodb.spark.sql.connector.connection.MongoClientFactory interface. Default: com.mongodb.spark.sql.connector.connection.DefaultMongoClientFactory
convertJson | Specifies if the connector parses string values and converts extended JSON into BSON. This setting accepts the following values: any (the connector converts all JSON values to BSON), objectOrArrayOnly (the connector converts only JSON objects and arrays to BSON), and false (the connector leaves all values as strings). Default: false
idFieldList | Specifies a field or list of fields by which to split the collection data. To specify more than one field, separate them using a comma as shown in the following example: fieldName1,fieldName2. Default: _id
ignoreNullValues | When true, the connector ignores any null values when writing, including null values in arrays and nested documents. Default: false
maxBatchSize | Specifies the maximum number of operations to batch in bulk operations. Default: 512
operationType | Specifies the type of write operation to perform. You can set this to one of the following values: insert (insert the data), replace (replace an existing document that matches the idFieldList value with the new data), or update (update an existing document that matches the idFieldList value with the new data). Default: replace
ordered | Specifies whether to perform ordered bulk operations. Default: true
upsertDocument | When true, replace and update operations insert the data if no match exists. For time series collections, you must set upsertDocument to false. Default: true
writeConcern.w | Specifies w, a write-concern option requesting acknowledgment that the write operation has propagated to a specified number of MongoDB nodes. For a list of allowed values for this option, see WriteConcern w Option in the MongoDB Server manual. Default: majority or 1
writeConcern.journal | Specifies j, a write-concern option requesting acknowledgment that the data has been written to the on-disk journal for the criteria specified in the w option. You can specify either true or false. For more information on j values, see WriteConcern j Option in the MongoDB Server manual.
writeConcern.wTimeoutMS | Specifies wTimeoutMS, a write-concern option to return an error when a write operation exceeds the specified number of milliseconds. If you use this optional setting, you must specify a nonnegative integer. For more information on wTimeoutMS values, see WriteConcern wtimeout in the MongoDB Server manual.
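A few of the constraints described in the table can be checked before a write job starts. The following Python sketch is only an illustration under stated assumptions: check_write_options is a hypothetical helper, the connector does not provide it, and the keys writeConcern.wTimeoutMS and writeConcern.journal are assumed to be the unprefixed names of the wTimeoutMS and j settings.

```python
def check_write_options(options, time_series=False):
    """Return a list of problems found in a dict of write options.

    Hypothetical pre-flight check; it covers only three rules stated
    in the table and does not contact MongoDB.
    """
    problems = []

    # wTimeoutMS is optional, but must be a nonnegative integer if set.
    timeout = options.get("writeConcern.wTimeoutMS")
    if timeout is not None:
        if isinstance(timeout, bool) or not isinstance(timeout, int) or timeout < 0:
            problems.append("writeConcern.wTimeoutMS must be a nonnegative integer")

    # The j option accepts only true or false.
    journal = options.get("writeConcern.journal")
    if journal is not None and not isinstance(journal, bool):
        problems.append("writeConcern.journal must be true or false")

    # upsertDocument defaults to true, so time series writes must
    # set it to false explicitly.
    if time_series and options.get("upsertDocument", True) is not False:
        problems.append("upsertDocument must be false for time series collections")

    return problems


for problem in check_write_options({"writeConcern.wTimeoutMS": -5}, time_series=True):
    print(problem)
```

The time_series flag is supplied by the caller because the sketch has no way to inspect the target collection.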
Specifying Properties in connection.uri
If you use SparkConf to specify any of the previous settings, you can either include them in the connection.uri setting or list them individually.
The following code example shows how to specify the database, collection, and convertJson setting as part of the connection.uri setting:
spark.mongodb.write.connection.uri=mongodb://127.0.0.1/myDB.myCollection?convertJson=any
To keep the connection.uri shorter and make the settings easier to read, you can specify them individually instead:
spark.mongodb.write.connection.uri=mongodb://127.0.0.1/
spark.mongodb.write.database=myDB
spark.mongodb.write.collection=myCollection
spark.mongodb.write.convertJson=any
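The relationship between the combined and the individual form can be sketched with Python's standard URL parser. This is a simplified illustration, not the connector's own parsing logic: split_connection_uri is a hypothetical helper, and it treats every query parameter as a connector setting, which may not hold for URIs that also carry driver options.

```python
from urllib.parse import parse_qsl, urlsplit


def split_connection_uri(uri):
    """Split a combined connection.uri into individual settings.

    Hypothetical helper: reads the database and collection from the
    path (database.collection) and copies query parameters as settings.
    """
    parts = urlsplit(uri)
    settings = {"connection.uri": f"{parts.scheme}://{parts.netloc}/"}

    # The path has the form /database.collection; both parts are optional.
    namespace = parts.path.lstrip("/")
    if namespace:
        database, _, collection = namespace.partition(".")
        settings["database"] = database
        if collection:
            settings["collection"] = collection

    # Each query parameter becomes its own setting.
    settings.update(parse_qsl(parts.query))
    return settings


combined = "mongodb://127.0.0.1/myDB.myCollection?convertJson=any"
for name, value in split_connection_uri(combined).items():
    print(f"spark.mongodb.write.{name}={value}")
```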
Important
If you specify a setting in both the connection.uri and on its own line, the connection.uri setting takes precedence. For example, in the following configuration, the connection database is foobar:
spark.mongodb.write.connection.uri=mongodb://127.0.0.1/foobar
spark.mongodb.write.database=bar
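The precedence rule can be mimicked in a short Python sketch. It only models the rule as described here and is not the connector's implementation; effective_database is a hypothetical name.

```python
from urllib.parse import urlsplit


def effective_database(conf):
    """Return the database name that wins under the precedence rule.

    Hypothetical model of the rule: a database named in connection.uri
    takes precedence over the separate database setting.
    """
    uri = conf.get("spark.mongodb.write.connection.uri", "")
    # The URI path has the form /database or /database.collection.
    namespace = urlsplit(uri).path.lstrip("/")
    uri_database = namespace.partition(".")[0]
    return uri_database or conf.get("spark.mongodb.write.database")


conf = {
    "spark.mongodb.write.connection.uri": "mongodb://127.0.0.1/foobar",
    "spark.mongodb.write.database": "bar",
}
print(effective_database(conf))  # foobar
```

If the connection.uri names no database, the sketch falls back to the separate database setting.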