Write to Layer
The examples below show how to publish data to each supported layer type.
The functions write_partitions and write_stream also accept iterators, which produce one value at a time. When an Adapter is used, the type to pass to the write functions is adapter-specific.
In the examples below, actual data is represented by the placeholder ... while partition identifiers are plausible.
Note (when running within a Jupyter environment)
When calling write_partitions with a payload of less than 50 MB, all data is published in a single transaction. When data sizes exceed 50 MB, a multipart upload is used to improve performance. The parallelization involved in multipart uploads interferes with Jupyter's internal asynchronous behavior if not explicitly handled. When uploading more than 50 MB to a single partition, include the following instructions in your notebook before you publish:

import nest_asyncio
import asyncio

nest_asyncio.apply(loop=asyncio.get_event_loop())
Write to versioned layer
To write to one or more versioned layers, a Publication must first be created. Publications for versioned layers work like transactions: a publication can be completed, or cancelled to drop all the metadata uploaded up to that moment.
Once a Publication is available, one or more write_partitions function calls can be used to write data to a layer. Each write function is layer-specific, and can be called more than once for the same or for multiple layers. See write_partitions for additional details.
Use set_partitions_metadata to update or delete metadata of partitions of a layer without uploading the content at the same time; the content has to be uploaded separately beforehand. The function also provides a way to delete partitions as part of a publication.
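As a sketch, a metadata-only update within a publication could look like the following; the parameter names partitions and delete and the shape of the metadata values are assumptions, so check the set_partitions_metadata reference for the actual signature.

```python
# Sketch only: assumes the content referenced by the metadata was
# uploaded beforehand, and that "partitions"/"delete" are the
# parameter names accepted by set_partitions_metadata.
layerA = catalog.get_layer("A")
with catalog.init_publication(layers=[layerA]) as publication:
    layerA.set_partitions_metadata(
        publication,
        partitions={"a1": ..., "a2": ...},  # metadata for previously uploaded content
        delete=["a3"],                      # partitions removed in the new version
    )
```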
Completing a publication involving one or more versioned layers creates a new version of the catalog. Please see init_publication for additional details.
Example: writing multiple layers and partitions with publication context manager
The following snippet creates a new catalog version in which partitions of multiple layers are added or modified. The content of each partition is encoded and uploaded. The transaction is committed when the with block terminates successfully, which calls Publication.complete() internally; if the with block terminates with an exception, Publication.cancel() is called instead.
Data is encoded according to the content type set in the layer configuration.
layerA = catalog.get_layer("A")
layerB = catalog.get_layer("B")
with catalog.init_publication(layers=[layerA, layerB]) as publication:
    layerA.write_partitions(publication, {"a1": ..., "a2": ..., "a3": ...})
    layerB.write_partitions(publication, {377893751: ..., 377893752: ...})
    layerB.write_partitions(publication, {377893753: ..., 377893754: ...})

Note: If the content type of your layer is not supported, or if reading or writing raw content is preferred, pass encode=False to the write_* functions to skip encoding or decoding and deal with raw bytes instead. This applies, for example, to the layer content type text/plain.
Example: writing multiple layers and partitions skipping encoding
Users can provide already-encoded data in the form of bytes compatible with the content type configured in each layer. In this case, encode=False should be specified.
layerA = catalog.get_layer("A")
layerB = catalog.get_layer("B")
publication = catalog.init_publication(layers=[layerA, layerB]) # ["A", "B"] also accepted
try:
    layerA.write_partitions(publication, {"a1": bytes(...), "a2": bytes(...)}, encode=False)
    layerB.write_partitions(publication, {377893751: bytes(...), 377893752: bytes(...)}, encode=False)
    publication.complete()
except:
    publication.cancel()
    raise

Write to volatile layer
To write to one or more volatile layers, a Publication must first be created. Publications for volatile layers should be closed when no longer needed via the complete function, to free resources. The cancel function has no effect due to the nature of volatile layers: the layer is not versioned, it does not support transactions, and successful writes cannot be rolled back.
Once a Publication is available, one or more write_partitions function calls can be used to write data to a layer. Each write function is layer-specific, and can be called more than once for the same or for multiple layers. See write_partitions for additional details.
Use set_partitions_metadata to update or delete metadata of partitions of a layer without uploading the content at the same time; the content has to be uploaded separately beforehand. The function also provides a way to delete partitions. Another way to delete partitions is delete_partitions.
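For illustration, removing partitions from a volatile layer with delete_partitions might look like the sketch below; whether the function takes the publication and a plain list of partition identifiers is an assumption, so consult the delete_partitions reference for the actual signature.

```python
# Sketch only: the parameter shapes are assumptions.
layerA = catalog.get_layer("A")
with catalog.init_publication(layers=[layerA]) as publication:
    # Remove the content and metadata of partitions "a1" and "a2".
    layerA.delete_partitions(publication, ["a1", "a2"])
```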
Example: writing multiple layers and partitions with publication context manager
The following snippet writes to two volatile layers. The content of each partition is encoded and uploaded. When the with block terminates successfully, Publication.complete() is called internally; if it terminates with an exception, Publication.cancel() is called instead.
Data is encoded according to the content type set in the layer configuration.
layerA = catalog.get_layer("A")
layerB = catalog.get_layer("B")
with catalog.init_publication(layers=[layerA, layerB]) as publication:
    layerA.write_partitions(publication, {"a1": ..., "a2": ..., "a3": ...})
    layerB.write_partitions(publication, {377893751: ..., 377893752: ...})

Example: writing multiple layers and partitions skipping encoding
Users can provide already-encoded data in the form of bytes compatible with the content type configured in each layer. In this case, encode=False should be specified.
layerA = catalog.get_layer("A")
layerB = catalog.get_layer("B")
publication = catalog.init_publication(layers=[layerA, layerB]) # ["A", "B"] also accepted
try:
    layerA.write_partitions(publication, {"a1": bytes(...), "a2": bytes(...)}, encode=False)
    layerB.write_partitions(publication, {377893751: bytes(...), 377893752: bytes(...)}, encode=False)
finally:
    publication.complete()

Write to index layer
Writing to an index layer does not require a publication, and is currently supported for one partition at a time.
Use the function write_single_partition to add data and the corresponding metadata to the index layer. Use the function delete_partitions to remove partitions of the index layer that match an RSQL query.
It is also possible to operate on the index layer metadata only via set_partitions_metadata. This adds and deletes index partitions at once. The content for the added partitions has to be uploaded separately beforehand.
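As an illustrative sketch, deleting index partitions by RSQL query could look like this; the query parameter name and the field name f1 are assumptions (the field name is borrowed from the examples below), and the RSQL grammar actually supported is described in the API reference.

```python
# Sketch only: delete every index partition whose field "f1" equals 100.
# The "==" comparison follows common RSQL syntax; the "query" parameter
# name is an assumption.
index_layer.delete_partitions(query="f1==100")
```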
Example: adding one partition to a layer
Data is encoded according to the content type set in the layer configuration.
fields = {
"f1": 100,
"f2": 500
}
index_layer.write_single_partition(data=..., fields=fields)

Example: adding one partition to a layer skipping encoding
Users can provide already-encoded data in the form of bytes compatible with the content type configured in each layer. In this case, encode=False should be specified.
fields = {
"f1": 100,
"f2": 500
}
index_layer.write_single_partition(data=bytes(...), fields=fields, encode=False)

Write to stream layer
To write to one or more stream layers, a Publication must first be created. Publications for stream layers should be closed when no longer needed via the complete function, to free resources. The cancel function has no effect due to the nature of stream layers: it is not possible to delete messages from a Kafka stream once written.
Once a Publication is available, one or more write_stream function calls can be used to write data to a layer. Each write function is layer-specific, and can be called more than once for the same or for multiple layers. See write_stream for additional details.
Use append_stream_metadata to write just the metadata (messages) of a layer to the stream, without uploading the content at the same time. The content has to be uploaded separately beforehand, or, when small enough, included in the data field of the messages.
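A hedged sketch of a metadata-only write to a stream follows; the message structure (a data handle plus an optional inline data field) and the exact parameters of append_stream_metadata are assumptions, so refer to the API documentation for the real message schema.

```python
# Sketch only: message fields and parameter names are assumptions.
stream_layer = catalog.get_layer("S")
with catalog.init_publication(layers=[stream_layer]) as publication:
    stream_layer.append_stream_metadata(
        publication,
        [
            {"partition": "p1", "dataHandle": "..."},        # content uploaded beforehand
            {"partition": "p2", "data": b"small payload"},   # small content inlined
        ],
    )
```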
Example: writing to streams with context manager
The content of each partition is encoded and uploaded. When the with block terminates successfully, Publication.complete() is called internally; if it terminates with an exception, Publication.cancel() is called instead.
Data is encoded according to the content type set in the layer configuration.
layerA = catalog.get_layer("A")
layerB = catalog.get_layer("B")
with catalog.init_publication(layers=[layerA, layerB]) as publication:
    layerA.write_stream(publication, {"a1": ..., "a2": ..., "a3": ...})
    layerB.write_stream(publication, {377893751: ..., 377893752: ...})
Example: writing to streams skipping encoding
Users can provide already-encoded data in the form of bytes compatible with the content type configured in each layer. In this case, encode=False should be specified.
layerA = catalog.get_layer("A")
layerB = catalog.get_layer("B")
publication = catalog.init_publication(layers=[layerA, layerB]) # ["A", "B"] also accepted
try:
    layerA.write_stream(publication, {"a1": bytes(...), "a2": bytes(...)}, encode=False)
    layerB.write_stream(publication, {377893751: bytes(...), 377893752: bytes(...)}, encode=False)
finally:
    publication.complete()

Write to Interactive Map layer
This layer type has no concept of partitions or raw data: there are no functions that write raw data or accept an encode parameter. The Interactive Map Layer API is modeled on the GeoJSON FeatureCollection concept. Consult the API documentation for full details.
When using the default adapter, a FeatureCollection or an iterator of Feature objects (both GeoJSON concepts) is passed directly as a parameter.
Example: writing GeoJSON features
from geojson import FeatureCollection, Feature, Point, Polygon
f1 = Feature(geometry=Polygon([[(0, 0), (0, 1), (1, 0), (0, 0)]]), properties={"a": 100, "b": 200})
f2 = Feature(geometry=Point((-1.5, 2.32)), properties={"a": 50, "b": 95})
features = FeatureCollection(features=[f1, f2])
interactive_map_layer.write_features(features)

Example: writing GeoJSON features from a file
geojson_file_path = "~/example.geojson"
interactive_map_layer.write_features(from_file=geojson_file_path)

Write to object store layer
Writing data to an Object Store layer requires that you define a key and provide the data to be associated with that key.
- Keys are strings composed of any of the following characters: a-zA-Z0-9.[]=(){}/_-
- Within a key, the / (slash) character is interpreted as a separator to define folder-like structures
- If the specified key already exists in the layer, the associated data is overwritten with the new data
- The data to be associated with a key can be given as either a local file or a bytes object
When writing data, you may optionally specify the content type of the data being uploaded. If not specified, the type is assumed to be application/octet-stream.
Example: write contents of a file
layerA = catalog.get_layer("A")
layerA.write_object(key="dir1/name_1", path_or_data="localdatafile.txt", content_type="text/plain")

Example: write bytes object
layerA = catalog.get_layer("A")
layerA.write_object(key="dir1/name_2", path_or_data=a_bytes_object)

Example: delete an existing object
To remove an existing object from a layer, use the delete_object method. This method provides an optional strict parameter which, if True, raises an exception when the key does not exist.
layerA = catalog.get_layer("A")
layerA.delete_object(key="dir1/name_1")
layerA.delete_object(key="dir1/name_1", strict=True)  # raises if the key does not exist

Example: copying an existing object
To copy an existing object within a layer, use the copy_object method. This method provides an optional replace parameter which, if True, replaces any object already stored at the target key.
layerA = catalog.get_layer("A")
layerA.copy_object(key="dir1/name_2", copy_from="dir1/name_1")
layerA.copy_object(key="dir1/name_2", copy_from="dir1/name_1", replace=True)