From 53e74f9f4bb302e012a629b3c5ed2fba7dbb97c9 Mon Sep 17 00:00:00 2001
From: Connor Sanders
Date: Tue, 26 Aug 2025 16:15:24 -0500
Subject: [PATCH] Update README and added skippable_types.avro dataset

---
 data/avro/README.md            |  13 +++++++------
 data/avro/skippable_types.avro | Bin 0 -> 3234 bytes
 2 files changed, 7 insertions(+), 6 deletions(-)
 create mode 100644 data/avro/skippable_types.avro

diff --git a/data/avro/README.md b/data/avro/README.md
index c0479a9..cdd7c69 100644
--- a/data/avro/README.md
+++ b/data/avro/README.md
@@ -38,9 +38,10 @@ for (fileStatus <- status) {
 
 Additional notes:
 
-| File | Description |
-|:--|:--|
-| alltypes_nulls_plain.avro | Contains a single row with null values for each scalar data type, i.e, `{"string_col":null,"int_col":null,"bool_col":null,"bigint_col":null,"float_col":null,"double_col":null,"bytes_col":null}`. Generated from https://gist.github.com/nenorbot/5a92e24f8f3615488f75e2a18a105c76 |
-| nested_records.avro | Contains two rows of nested record types. Generated from https://github.com/sarutak/avro-data-generator/blob/master/src/bin/nested-records.rs |
-| simple_enum.avro | Contains four rows of enum types. Generated from https://github.com/sarutak/avro-data-generator/blob/master/src/bin/simple-enum.rs |
-| simple_fixed | Contains two rows of fixed types. Generated from https://github.com/sarutak/avro-data-generator/blob/master/src/bin/simple-fixed.rs |
\ No newline at end of file
+| File | Description |
+|:--------------------------|:------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
+| alltypes_nulls_plain.avro | Contains a single row with null values for each scalar data type, i.e, `{"string_col":null,"int_col":null,"bool_col":null,"bigint_col":null,"float_col":null,"double_col":null,"bytes_col":null}`. Generated from https://gist.github.com/nenorbot/5a92e24f8f3615488f75e2a18a105c76 |
+| nested_records.avro | Contains two rows of nested record types. Generated from https://github.com/sarutak/avro-data-generator/blob/master/src/bin/nested-records.rs |
+| simple_enum.avro | Contains four rows of enum types. Generated from https://github.com/sarutak/avro-data-generator/blob/master/src/bin/simple-enum.rs |
+| simple_fixed | Contains two rows of fixed types. Generated from https://github.com/sarutak/avro-data-generator/blob/master/src/bin/simple-fixed.rs |
+| skippable_types.avro | Contains four rows of all skippable types supported by arrow-avro. 
Generated from https://gist.github.com/jecsand838/82d9874a5f9be8a636dcd49ad9b8e237 | diff --git a/data/avro/skippable_types.avro b/data/avro/skippable_types.avro new file mode 100644 index 0000000000000000000000000000000000000000..b0518e0056b5ae3ecea7bb50e6a474fe676e9b82 GIT binary patch literal 3234 zcmb_eU1%It6ux^iS=O|qYOAk?g;F0nb!K*Uc2|VPG_eVYmYAXtmgUaOopj9X%rdjv zW)lKy#bWS5i!~?(eNhoXklKf$V1s=r2u1M0VDQ1d^hvEsD|ur9vtiHsg7e&V#ih6 z2T^U)uE#Z*Fsfxa_Cj0@q)|P?a>xKYUmVrcod!!gjv2_Kx@xmdy|}6`j%s>!*BlgS zAdl+mq#pO*Ef#RVc($AcQzI7C2TBM=+0_W&b+bwO0;pFf)!H2b2IPAWH$)x~vtce#Jy!lMwf=eUR6PX-q5vY#T!y-d zLbPUVQq3x{DuX+#3mvproZz-Sc&M?dHa*j^eK+8ewF~%^2D|*%*-Qqp21zNsNh$wJ zV2y^E#BO_Z;;z81ue+$-s3sH&j(h8L#IYRLPpDZ{9n0T$+1aT{5FMSJntBFaCZ0Y% zl`Z#i*>FjnQJ&x{))bR%8{c2ME2MZ8tJ zUi=!E&-#}Yki>RtIg(by+++8k9q-9`-;SC4Fj0H9eCrj*s07(3t;|@Zy3?LW-3%)% z@G$6VzBl?J1VC@F_&YqbH%w9{{SDLMx0}&_+h2Vp4rYJ_8O}&T7bdWL$H|UL`V&CU0IK6>TW^!WL++-oD3hsVh^GcyuN~f-mquyuz z;^Aq77E~=KsJyJvBKyiyUR4SjFEJ}3>A50R6qI6Cn0+qoLMV;7G~zJ9OIXECG&*X+ z9P%PUa~$Sek>gwmaonPa!5_CMA;k6ChXH^Lq13lqt*>X;%jMN8=k|Xxb72mljZ4f| z>(dX+NKMbVX{aI}}+C=6M#t&v3 z+iD`5S*EUonLpfXi_kV>V9(zn4J`7F!|kXyM(!UI#*dA9@E<@&J!bV5TzvK zHC=^(IuE{@VgWDV7ts5A+Zo7kp1zvGPrfmPCpI~Jgf$JpSjQ^D8s>J=C