arrow-julia

Official Julia implementation of Apache Arrow

OTHER License

Stars: 285
Committers: 42

arrow-julia - v1.4.0

Published by github-actions[bot] over 3 years ago

Arrow v1.4.0

Diff since v1.3.0

Closed issues:

  • reconsidering the current type registration/serialization mechanism (and its internal usage) (#88)
  • provide mechanism to free metadata stored in OBJ_METADATA? (#90)
  • Arrow.write slow perf with ZonedDateTime (#95)
  • Implement DataAPI pool/dict encoding methods for DictEncoded (#120)
  • Slower materialization Feather vs Arrow (#131)
  • Usage with MPI (#151)
  • Reading CSV (#157)
  • Reading an Arrow file with no message batches after the schema seems to produce a partly initialized Table? (#158)
  • DictEncoded methods for refpool, refarray and levels (#159)
  • MethodError Int64(::Arrow.Timestamp... when reading arrow file saved by pandas. (#166)
  • Improve printing? (#168)

Merged pull requests:

  • Add refpool, refarray and levels for DictEncoded (#161) (@dmbates)
  • Tweak promoteunion to always avoid abstract types (#162) (@quinnj)
  • Restructure ArrowTypes so it can be registered as its own package (#163) (@quinnj)
  • DataAPI methods (#164) (@quinnj)
  • Don't store table metadata globally (#165) (@quinnj)
  • document guarantee that getmetadata returns alias not copy (#169) (@jrevels)
  • add missing setmetadata! method for Arrow.Table (#170) (@jrevels)
  • use actual deprecation for registertype! (#171) (@ericphanson)
  • Warn when converting Arrow.Timestamps to Dates.DateTime or ZonedDateTime (#172) (@quinnj)
  • Introduce Arrow.ToTimestamp for performant ZonedDateTime encoding (#173) (@quinnj)
  • Fix () -> {} typo in docs (#174) (@etpinard)
  • Fix case when ipc stream has no record batches, only schema (#175) (@quinnj)
  • Fix slight perf hit when checking validity bitmap (#176) (@quinnj)

arrow-julia - v1.3.0

Published by github-actions[bot] over 3 years ago

Arrow v1.3.0

Diff since v1.2.4

Closed issues:

  • Attempting to serialize DataTypes induces segfault (#74)
  • tables containing Set values are serializable but corresponding deserialized Arrow.Tables are inaccessible (#75)
  • support for heterogeneously typed tuples (#85)
  • Difficult to read the code (#91)
  • Arrow.write hangs on Tables.partitioner (#108)
  • Unsafe conversion to signed integer types (#121)
  • Arrow.write in v1.4.2 can create an invalid arrow file (#126)
  • Arrow dataset imported as DataFrames are not pure DataFrames? (#127)
  • Arrow.jl issue with struct types (#128)
  • unsupported ARROW:extension:name type: "JuliaLang.Nothing" (#132)
  • Loss of parametric type information for custom types (#134)
  • Avoid assuming field values can be used in constructors (#135)
  • Help (#137)
  • Losing type in unnamed column (#138)
  • How to handle parametric Unitful types (#139)
  • Can't serialize structs that contain ::Type{T} fields (#140)
  • Cannot iterate Arrow.Stream (#141)
  • Arrow.write("my.arrow", CategoricalArray([1,2,3])) hangs (#143)
  • Arrow Table conversion to DataFrame throws DimensionMismatch Error (#144)
  • copying Arrow.Table does not always copy columns (#146)
  • Hang with multithreaded reading (#155)

Merged pull requests:

  • Add ntasks keyword to limit # of tasks allowed to write at a time (#106) (@quinnj)
  • Fix typo (#130) (@Sov-trotter)
  • implement Base.IteratorSize for Stream, fixes #141 (#142) (@damiendr)
  • Introduce new maxdepth keyword argument for setting a limit on nesting (#147) (@quinnj)
  • Ensure dict encoded index types match from record batch to record batch (#148) (@quinnj)
  • Ensure serializing Arrow.DictEncoded writes dictionary messages (#149) (@quinnj)
  • revert setting Arrow.write debug message threshold to -1 (#152) (@jrevels)
  • add unexported tobuffer utility for interactive testing/development (#153) (@jrevels)
  • Better handle errors when something goes wrong writing partitions (#154) (@quinnj)
  • Overhaul type serialization/deserialization machinery (#156) (@quinnj)

arrow-julia - v1.2.4

Published by github-actions[bot] over 3 years ago

Arrow v1.2.4

Diff since v1.2.3

Merged pull requests:

  • fix accidental invocation of _unsafe_load_tuple (#124) (@jrevels)

arrow-julia - v1.2.3

Published by github-actions[bot] over 3 years ago

Arrow v1.2.3

Diff since v1.2.2

Merged pull requests:

  • Use pool length in signed int conversion (#122) (@dmbates)

arrow-julia - v1.2.2

Published by github-actions[bot] over 3 years ago

Arrow v1.2.2

Diff since v1.2.1

Closed issues:

  • Segmentation Fault with Threads.@spawn + Tables.partitioner + write with compression (#82)
  • Types deserialize differently during session in which they were written (#88)
  • Producing unsigned dict encoding indices; should be signed (#112)
  • Unsigned integers as indices in DictEncoded type (#113)
  • DictEncoded doesn't write as DictEncoded (#116)
  • Errors writing file with missing in categorical (#117)

Merged pull requests:

  • Make compressed writing threadsafe (#118) (@quinnj)
  • Rework dict encoding of PooledArray/CategoricalArray to fix outstandi… (#119) (@quinnj)

arrow-julia - v1.2.1

Published by github-actions[bot] over 3 years ago

Arrow v1.2.1

Diff since v1.2.0

Closed issues:

  • Error constructing a DataFrame with a dict-encoded column (#102)
  • Why does unpacking a DictEncoding insert a ChainedVector layer? (#109)

Merged pull requests:

  • Don't use ChainedVector as DictEncoding data array unless necessary (#110) (@quinnj)
  • Fix copy on DictEncode (#111) (@quinnj)

arrow-julia - v1.2.0

Published by github-actions[bot] over 3 years ago

Arrow v1.2.0

Diff since v1.1.0

Closed issues:

  • Error with DatePart('Z') (#81)
  • Cannot copy a DataFrame containing a DictEncoded field with a missing value (#101)

Merged pull requests:

  • change UUID <-> Arrow mapping to (de)serialize to/from 16-byte FixedSizeBinary (#103) (@jrevels)
  • add isbitstype optimized path for FixedSizeList getindex (#104) (@jrevels)
  • bump Project.toml to v1.2.0 (#107) (@jrevels)

arrow-julia - v1.1.0

Published by github-actions[bot] almost 4 years ago

Arrow v1.1.0

Diff since v1.0.3

Closed issues:

  • memory leaking when reading compressed arrow files (#80)
  • writing column with missing / struct data errors (#84)
  • downstream packages need to put Arrow.ArrowTypes.registertype! statements in __init__ (#87)

Merged pull requests:

  • Support new Decimal256 type (#79) (@quinnj)
  • fix typo (#83) (@ericphanson)
  • add ArrowTypes.default methods and tests for dates (#86) (@ericphanson)
  • add default UUID <-> UInt128 Arrow type mapping (#89) (@jrevels)
  • bump Project.toml to v1.1.0 (#94) (@jrevels)
  • Add warning for Arrow.ArrowTypes.registertype! (#96) (@ericphanson)
  • Fix deploydocs (#97) (@ericphanson)
  • convert Arrow-flavored eltypes to Julia-flavored eltypes on copy (#98) (@jrevels)
  • Fix copy on DictEncoding arrays with missing values (#99) (@quinnj)
  • Add BitIntegers compat (#100) (@quinnj)

arrow-julia - v1.0.3

Published by github-actions[bot] almost 4 years ago

Arrow v1.0.3

Diff since v1.0.2

Closed issues:

  • NamedTuple{...,Union{...}} values are serializable but inaccessible once deserialized (#76)

Merged pull requests:

  • Fix Union type deserialization (#77) (@quinnj)

arrow-julia - v1.0.2

Published by github-actions[bot] almost 4 years ago

Arrow v1.0.2

Diff since v1.0.1

Closed issues:

  • is not a valid arrow file (#71)
  • factor read as string (#72)

Merged pull requests:

  • Finish support for automatic custom struct deserialization (#73) (@quinnj)

arrow-julia - v1.0.1

Published by github-actions[bot] almost 4 years ago

Arrow v1.0.1

Diff since v1.0.0

Closed issues:

  • Wrong handling of missing with Char (#68)

Merged pull requests:

  • Check field nullability for custom extension types (#69) (@quinnj)

arrow-julia - v1.0.0

Published by github-actions[bot] almost 4 years ago

Arrow v1.0.0

Diff since v0.4.1

Closed issues:

  • Arrow conversion type for naive Julia DateTime - Arrow.Date or Arrow.Timestamp? (#58)
  • No error when writing NamedTuple "table" with different-length columns (#60)
  • Releasing the lock on arrow file (#61)

Merged pull requests:

  • get documentation going (#62) (@ExpandingMan)
  • fixed docs link (#64) (@ExpandingMan)
  • Add validity check for columns with different lengths; fixes #60 (#65) (@quinnj)
  • Auto-convert DateTime to arrow Timestamp instead of millisecond Date;… (#66) (@quinnj)

arrow-julia - v0.4.1

Published by julia-tagbot[bot] almost 4 years ago

arrow-julia - v0.4.0

Published by julia-tagbot[bot] almost 4 years ago

arrow-julia - v0.3.0

Published by julia-tagbot[bot] about 4 years ago