Azure Functions Blob Access Trace 2020

-- revision 1, 20210904

Introduction

This is a sample of the blob accesses in Microsoft's Azure Functions, collected between November 23rd and December 6th, 2020. It is the dataset described and analyzed in the SoCC 2021 paper 'Faa$T: A Transparent Auto-Scaling Cache for Serverless Applications'.

Functions in Azure Functions are grouped into Applications. This trace includes data for only a random sample of Azure Functions applications. Sampling is done per application, so if an application appears in the trace, all of its functions are included. The sampling rate is unspecified for confidentiality reasons.
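The description above does not say how applications were selected. Purely to illustrate what per-application sampling means, here is a minimal sketch that assumes a hash-based rule. The function names, the `rate` parameter, and the use of SHA-256 are all hypothetical and are not taken from the dataset's actual pipeline.

```python
import hashlib


def app_is_sampled(app_name: str, rate: float) -> bool:
    """Return True if the application falls inside the sampled fraction.

    Hypothetical rule: hash the application name to a 64-bit integer and
    keep the application when that integer is below rate * 2**64.
    """
    digest = hashlib.sha256(app_name.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big")
    return bucket < rate * 2**64


def sample_by_application(records, rate):
    """Keep every record of each sampled application, and nothing else."""
    return [r for r in records if app_is_sampled(r["AnonAppName"], rate)]
```

Because the decision depends only on the application name, an application is either fully present or fully absent, which is the property the trace description states.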

The dataset comprises this description and a Jupyter Notebook with the plots in the SoCC paper.

Using the Data

License

The data is made available and licensed under a CC-BY Attribution License. By downloading or using it, you agree to the terms of this license.

Attribution

If you use this data for a publication or project, please cite the accompanying paper:

Francisco Romero, Gohar Irfan Chaudhry, Íñigo Goiri, Pragna Gopa, Paul Batum, Neeraja J. Yadwadkar, Rodrigo Fonseca, Christos Kozyrakis, Ricardo Bianchini. "Faa$T: A Transparent Auto-Scaling Cache for Serverless Applications", in Proceedings of the ACM Symposium on Cloud Computing 2021 (SoCC 21). ACM, Seattle, WA, 2021.

Lastly, if you have any questions, comments, or concerns, or if you would like to share tools for working with the traces, please contact us at azurepublicdataset@service.microsoft.com

Downloading

You can download the dataset here: https://github.com/Azure/AzurePublicDataset/releases/download/dataset-functions-blob-2020/azurefunctions_dataset2020_azurefunctions-accesses-2020.csv.bz2
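Once downloaded, the file can be read without decompressing it to disk first. The sketch below uses only the Python standard library. It assumes, without confirmation from this description, that the file is comma-separated and begins with a header row naming the fields; `iter_accesses` is a hypothetical helper name.

```python
import bz2
import csv


def iter_accesses(source):
    """Yield one dict per access from a bz2-compressed CSV.

    `source` may be a file path or a binary file object.
    Assumption: comma-separated values with a header row.
    """
    with bz2.open(source, mode="rt", newline="") as handle:
        yield from csv.DictReader(handle)
```

Streaming rows this way keeps memory use flat, which may matter if the uncompressed trace is large. All values arrive as strings and need converting before numeric use.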

Schema and Description

Schema

| Field | Description |
|---|---|
| Timestamp | Access time in milliseconds since 1970 |
| AnonRegion | Unique id for the region [1] |
| AnonUserId | Unique id for the user [1] |
| AnonAppName | Unique id for the application [1] |
| AnonFunctionInvocationId | Unique id for the invocation [1] |
| AnonBlobName | Unique id for the blob accessed [1] |
| BlobType | Type of the blob accessed |
| AnonBlobETag | Version of the blob accessed [1] |
| BlobBytes | Number of bytes of the blob |
| Read | If the access is a read |
| Write | If the access is a write |

Notes

  1. Ids are hashed using HMAC-SHA512 with secret salts and cropped.
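To make the note concrete, here is a minimal sketch of keyed hashing followed by cropping, using Python's standard `hmac` module. The salt value, the hexadecimal encoding, and the crop length are assumptions for illustration; the real salts are secret, and the encoding used for the released ids is not described here. `anonymize_id` is a hypothetical name.

```python
import hashlib
import hmac


def anonymize_id(value: str, salt: bytes, length: int) -> str:
    """Hash `value` with HMAC-SHA512 keyed by `salt`, then crop the result.

    Assumption: the digest is hex-encoded before cropping. The released
    ids may use a different encoding.
    """
    digest = hmac.new(salt, value.encode("utf-8"), hashlib.sha512).hexdigest()
    return digest[:length]
```

The same input and salt always map to the same id, so accesses to one blob or application stay linkable within the trace, while the original names cannot be recovered without the salt.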

Sample

| Timestamp | AnonRegion | AnonUserId | AnonAppName | AnonFunctionInvocationId | AnonBlobName | BlobType | AnonBlobETag | BlobBytes | Read | Write |
|---|---|---|---|---|---|---|---|---|---|---|
| 1606092900138 | 6ex | 775920313 | 9gti3olh | 1565080819 | jfvf7k9kwiiq7gdx | BlockBlob/application/octet-stream | kq2su6bhi0 | 30.0 | True | False |
| 1606928903185 | 6ex | 1252244298 | 7c51my6n | 1191849141 | 1fjxqoqi2nc5njpg | BlockBlob/application/zip | ibd6a5v5pv | 1938488.0 | True | False |
| 1606355700058 | iic | 1495523193 | uf2u84b0 | 1302383289 | tp783etybrgxap8x | BlockBlob/ | 6mreka6qhr | 36.0 | False | True |
| 1606924856178 | iic | 705112778 | 1jgfqbn6 | 1869133266 | 80lssrlkciitddx9 | BlockBlob/ | if8foq3a81 | 2204780.0 | False | True |
| 1606658957997 | 6ex | 1252244298 | 15dp5na6 | 1468781831 | juijw2ldiogyem3c | BlockBlob/application/zip | 414fgngli4 | 359512.0 | True | False |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 1607270691764 | ayi | 1003538042 | 766ofcie | 1080821259 | sfocyrxcksjgri5t | BlockBlob/application/json | tanw2860j5 | 164.0 | True | False |
| 1607270691884 | ayi | 1003538042 | 766ofcie | 1530317863 | aat6cv8j2cofwj1a | BlockBlob/application/json | gf05emgb6t | 164.0 | True | False |
| 1607270692007 | ayi | 1003538042 | 766ofcie | 358892311 | u7p02pymm07pa7bg | BlockBlob/application/json | kl2uv31e7y | 164.0 | True | False |
| 1607270692134 | ayi | 1003538042 | 766ofcie | 1978924507 | 9qeai70lggcku3c5 | BlockBlob/application/json | 3xa1dkrq7m | 164.0 | True | False |
| 1607270692284 | ayi | 1003538042 | 766ofcie | 1142206120 | t8e88ksd6fiy2dx0 | BlockBlob/application/json | bp4ynk65sl | 164.0 | True | False |
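Rows like those in the sample can be converted into typed records before analysis. The sketch below assumes each row is a dict of strings keyed by the schema's field names, that "milliseconds since 1970" means the Unix epoch in UTC, and that `Read` and `Write` are the literal strings `True` or `False`. `BlobAccess` and `parse_access` are hypothetical names.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


@dataclass(frozen=True)
class BlobAccess:
    timestamp: datetime
    region: str
    user_id: str
    app_name: str
    invocation_id: str
    blob_name: str
    blob_type: str
    blob_etag: str
    blob_bytes: float
    read: bool
    write: bool


def parse_access(row: dict) -> BlobAccess:
    """Convert one row of strings into a typed record."""
    return BlobAccess(
        # Integer milliseconds are added exactly, avoiding float rounding.
        timestamp=EPOCH + timedelta(milliseconds=int(row["Timestamp"])),
        region=row["AnonRegion"],
        user_id=row["AnonUserId"],
        app_name=row["AnonAppName"],
        invocation_id=row["AnonFunctionInvocationId"],
        blob_name=row["AnonBlobName"],
        blob_type=row["BlobType"],
        blob_etag=row["AnonBlobETag"],
        blob_bytes=float(row["BlobBytes"]),
        read=row["Read"] == "True",
        write=row["Write"] == "True",
    )


# The first sample row, written out field by field.
first_sample_row = {
    "Timestamp": "1606092900138",
    "AnonRegion": "6ex",
    "AnonUserId": "775920313",
    "AnonAppName": "9gti3olh",
    "AnonFunctionInvocationId": "1565080819",
    "AnonBlobName": "jfvf7k9kwiiq7gdx",
    "BlobType": "BlockBlob/application/octet-stream",
    "AnonBlobETag": "kq2su6bhi0",
    "BlobBytes": "30.0",
    "Read": "True",
    "Write": "False",
}
access = parse_access(first_sample_row)
print(access.timestamp.isoformat())  # 2020-11-23T00:55:00.138000+00:00
```

The ids are kept as strings even when they look numeric, since they are opaque hashes rather than quantities.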

Validation

This data is the sample data used in the SoCC paper mentioned above. To verify the data, we reproduce the characterization graphs in the paper using the released trace in this Jupyter Notebook.
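As a quick sanity check on a downloaded copy before opening the notebook, a few basic counts can be computed directly from the rows. This sketch is not one of the paper's figures and makes no claim about expected values. It assumes rows are dicts of strings keyed by the schema's field names, and `summarize_accesses` is a hypothetical helper.

```python
from collections import Counter


def summarize_accesses(rows):
    """Return simple access counts for an iterable of row dicts."""
    reads = 0
    writes = 0
    accesses_per_blob = Counter()
    for row in rows:
        # Read and Write are assumed to be the strings "True" or "False".
        if row["Read"] == "True":
            reads += 1
        if row["Write"] == "True":
            writes += 1
        accesses_per_blob[row["AnonBlobName"]] += 1
    return {
        "accesses": sum(accesses_per_blob.values()),
        "reads": reads,
        "writes": writes,
        "distinct_blobs": len(accesses_per_blob),
        "blobs_accessed_once": sum(
            1 for count in accesses_per_blob.values() if count == 1
        ),
    }
```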