deeplake

Database for AI. Store Vectors, Images, Texts, Videos, etc. Use with LLMs/LangChain. Store, query, version, & visualize any AI data. Stream data in real-time to PyTorch/TensorFlow. https://activeloop.ai

MPL-2.0 License

Downloads: 56K
Stars: 7.8K
Committers: 121


deeplake - v2.8.0 🌈

Published by github-actions[bot] about 2 years ago

🧭 What's Changed

  • Release Candidate 0 for new experimental dataloader and queries (#1819) @AbhinavTuli
  • [AL-1946] Fix delete group + reset bug (#1843) @AbhinavTuli
  • [DL-652] Add append_empty arg to ds.append (#1846) @farizrahman4u
  • Avoid printing syncing labels message when no labels were added (#1845) @FayazRahman
  • [DL-684] Fix ds.reset bug with local datasets (#1842) @FayazRahman
  • Use staging visualizer in tests. Correct dev visualizer url. (#1838) @khustup
  • Changes default chunk id size to 8 bits from 4 bits to reduce possibility of collisions (#1835) @AbhinavTuli
  • wandb integration (#1739) @farizrahman4u
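
The chunk ID change in #1835 is motivated by collisions. As a rough illustration of why a wider ID helps, here is a minimal sketch using the birthday-bound approximation. It assumes IDs are drawn uniformly at random and that the two sizes correspond to 32-bit and 64-bit IDs. Both are assumptions: the entry gives the sizes as 4 and 8 bits and does not describe how IDs are generated. The function name `collision_probability` is hypothetical and not part of the library.

```python
import math


def collision_probability(num_ids: int, id_bits: int) -> float:
    """Approximate P(at least one collision) among num_ids IDs drawn
    uniformly at random from a space of 2**id_bits values."""
    space = 2 ** id_bits
    # Birthday bound: 1 - exp(-n(n-1) / (2N)).
    # expm1 keeps precision when the probability is very small.
    return -math.expm1(-num_ids * (num_ids - 1) / (2 * space))


# Assumed ID widths (32 and 64 bits) and an arbitrary chunk count.
for bits in (32, 64):
    p = collision_probability(1_000_000, bits)
    print(f"{bits}-bit IDs, 1,000,000 chunks: P(collision) ~ {p:.3g}")
```

Under these assumptions, a dataset with a million chunks is almost certain to see a collision with the narrower ID, while the wider ID makes one very unlikely.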

⚙️ Who Contributes

@AbhinavTuli, @FayazRahman, @farizrahman4u and @khustup

deeplake - v2.7.5 🌈

Published by github-actions[bot] about 2 years ago

🧭 What's Changed

  • [AL-1775] Point Cloud htype (#1685) @adolkhan
  • [AL-1912] Don't allow generic htypes with link (#1824) @AbhinavTuli
  • [Bugfix] Fixes rechunking with hub link + cloud paths (#1825) @AbhinavTuli
  • Enable progressbar for syncing labels (#1820) @FayazRahman
  • [Bug fix] Ensure None/"ENV" isn't added to used_creds_keys for linked data (#1823) @AbhinavTuli

⚙️ Who Contributes

@AbhinavTuli, @FayazRahman and @adolkhan

deeplake - v2.7.4 🌈

Published by github-actions[bot] about 2 years ago

🧭 What's Changed

  • Fix get_incompatible_dtype bug (#1814) @farizrahman4u
  • [AL-1888] Enable rechunking for text like htypes (#1815) @AbhinavTuli
  • [AL-1858] Treat empty list as None (#1813) @AbhinavTuli
  • Older reporting configurations were not properly handling username (#1806) @zomglings

⚙️ Who Contributes

@AbhinavTuli, @farizrahman4u and @zomglings

deeplake - v2.7.3 🌈

Published by github-actions[bot] about 2 years ago

🧭 What's Changed

  • [AL-1884] Fixes bug with ds.reset for newly added/deleted tensors (#1797) @AbhinavTuli
  • [DL-618] Appending to class labels with text using multiple workers (#1794) @FayazRahman
  • [AL-1848] New agreements handling (#1796) @AbhinavTuli
  • [DL-590] S3: Always show retry warnings (#1807) @farizrahman4u
  • [DL-620] Prevent saving of dataset views for public datasets when user is not logged in (#1803) @farizrahman4u

⚙️ Who Contributes

@AbhinavTuli, @FayazRahman and @farizrahman4u

deeplake - v2.7.2 🌈

Published by github-actions[bot] about 2 years ago

🧭 What's Changed

  • [DL-593] Bugout correctly identifying the user's username when tokens are used (#1792) @adolkhan
  • Fix double indexing when saving strided views (#1793) @farizrahman4u

🚀 New

  • GCP support for connected datasets (#1736) @ProgerDav

⚙️ Who Contributes

@ProgerDav, @adolkhan, @davidbuniat and @farizrahman4u

deeplake - v2.7.1 🌈

Published by github-actions[bot] about 2 years ago

🧭 What's Changed

  • [AL-1855] Adds support for JWT token while reading from http urls (#1780) @AbhinavTuli
  • Fix visualizer links for local views (#1791) @FayazRahman
  • Unhide auto and util submodules in docs (#1790) @FayazRahman
  • [AL-1813] Clarify whether a user is logged-in or not in authentication/permission errors (#1750) @adolkhan
  • Fix linked video playback test (#1789) @FayazRahman

⚙️ Who Contributes

@AbhinavTuli, @FayazRahman, @adolkhan and @davidbuniat

deeplake - v2.7.0 🌈

Published by github-actions[bot] over 2 years ago

🧭 What's Changed

  • Disable locking for optimized views + new view urls (#1788) @farizrahman4u
  • Optimize data loading for fixed shape tensors (#1734) @AbhinavTuli
  • [AL-1809] refactor .data (#1769) @adolkhan
  • Fixes handling of empty tensors in pytorch (#1786) @AbhinavTuli
  • Make pbar on shuffle buffer filling readable (#1787) @farizrahman4u
  • Enable parallelization for view optimization (#1785) @farizrahman4u
  • Allow passing creds explicitly for external views (#1783) @farizrahman4u
  • [AL-1842] Update API reference for dataset views (#1781) @FayazRahman
  • [Bug] Fixes bug with links and compute (#1778) @AbhinavTuli
  • Adds ability to pop sample from dataset instead of popping from individual tensors (#1776) @AbhinavTuli
  • [AL-1811] - empty tensor error handling (#1741) @adolkhan
  • [AL-1816] Make fetching multiple images smarter (#1744) @adolkhan
  • [AL-1851][AL-1852][AL-1853][AL-1854] Allow plat to provide creds explicitly while saving views to user dir (#1775) @farizrahman4u
  • [Improvement] Retries in s3 get_object_from_full_url (#1777) @AbhinavTuli
  • [AL-1817][AL-1842] API reference for ViewEntry class (#1773) @FayazRahman
  • View fixes (#1774) @FayazRahman
  • [BUGFIX] Fix double indexing when fetching shapes of samples in views (#1772) @farizrahman4u
  • Fixes to linked tensors, partial reads (#1770) @AbhinavTuli
  • Allow platform to specify username for saving queries against public datasets (#1771) @farizrahman4u
  • [AL-1817][AL-1482] API reference updates (#1768) @FayazRahman
  • [AL-1807][AL-1844][AL-1845][AL-1849] Dataset view updates (#1764) @farizrahman4u
  • [API] Added example for changing credential management (#1766) @AbhinavTuli
  • [BUGFIX] Make linked video test work without google creds (#1765) @FayazRahman
  • Changed dev location to https://app-dev.activeloop.dev/ (#1767) @timfox456
  • [Bug fix] Fixes tensor meta is_link fast forwarding (#1763) @AbhinavTuli
  • [AL-1738] Adds ability to delete sample (#1673) @AbhinavTuli

⚙️ Who Contributes

@AbhinavTuli, @FayazRahman, @adolkhan, @davidbuniat, @farizrahman4u and @timfox456

deeplake - v2.6.0 🌈

Published by github-actions[bot] over 2 years ago

🧭 What's Changed

  • Change default max chunk size (#1759) @farizrahman4u
  • Removes sequence test (#1762) @AbhinavTuli
  • [AL-1818] Adds ability to pad tensors in pytorch and compute (#1751) @AbhinavTuli
  • Enable random assignment by default (#1742) @farizrahman4u
  • deepcopy + s3 fixes (#1749) @farizrahman4u
  • [AL-1748] Show a progress bar while pytorch shuffle buffer is getting filled (#1745) @AbhinavTuli
  • Update class_labels with text (#1756) @FayazRahman
  • Update README.md (#1755) @istranic
  • Fixes a bug with Version control + Info modification (#1753) @AbhinavTuli
  • [AL-1780] [Small] Add support for pathlib.Path arguments (#1683) @adolkhan
  • [BUGFIX] Fix appending to nested class_label tensors with text (#1743) @FayazRahman
  • Skip google drive root test if no --gdrive flag (#1747) @FayazRahman
  • [AL-1781] Reduce duration of CI tests (#1713) @FayazRahman
  • Fix linked video tests (#1740) @FayazRahman
  • [AL-1785] Changes to linked tensor credential management + other fixes (#1726) @AbhinavTuli
  • Prevent changing logging configuration (#1738) @FayazRahman
  • [AL-1822] Register hub datasets created by deepcopy (#1732) @farizrahman4u
  • [AL-1821] ds.get_view (#1729) @farizrahman4u
  • [AL-1829] Support views from datasets with uneven tensors (#1737) @farizrahman4u
  • Google drive fixes (#1730) @FayazRahman
  • Support timestamps for linked videos (#1731) @FayazRahman
  • [AL-1820] view.sample_indices (#1728) @farizrahman4u
  • [Optimization] Make sample size calculation faster in pytorch (#1725) @AbhinavTuli
  • spell check - Found pytorch misspelt while reading docs (#1724) @neel2299
  • [AL-1806] Do partial reads to retrieve shape for samples (#1721) @AbhinavTuli
  • Fix linked video playback test (#1723) @FayazRahman
  • Rechunking fixes (#1701) @farizrahman4u
  • [AL-1810][AL-1814] Pytorch fixes (#1719) @AbhinavTuli
  • [AL-1808] Support playing linked videos (#1717) @FayazRahman
  • [BUG] Fix setitem issue with list of length 1 (#1718) @AbhinavTuli
  • [BUG] Fix tobytes for linked tensors (#1714) @FayazRahman
  • [AL-1795] Changes .data to return both numeric and text value for class labels (#1716) @AbhinavTuli
  • [AL-1787] Enable appending to class_label tensors with text (#1710) @FayazRahman
  • Update README.md (#1712) @istranic

🐛 Bug Fixes

  • Correctly passing credentials to gcp provider (#1715) @ProgerDav

⚙️ Who Contributes

@AbhinavTuli, @FayazRahman, @ProgerDav, @adolkhan, @davidbuniat, @farizrahman4u, @istranic, @mikayelh and @neel2299

deeplake - v2.5.2 🌈

Published by github-actions[bot] over 2 years ago

🧭 What's Changed

  • [BUG] Fix empty sample partial read (#1709) @AbhinavTuli
  • [BUG] Prevent changing htype config when updating class_names (#1711) @FayazRahman
  • htype convenience class (#1698) @Diveafall
  • [AL-1792][AL-1793][AL-1804][AL-1805] Views bug fixes and api improvements (#1694) @farizrahman4u
  • [AL-1790] Adds ability to return indexes in pytorch (#1708) @AbhinavTuli
  • [AL-1800] Transforms should respect tiling threshold (#1707) @farizrahman4u
  • Docstring for timestamp (#1704) @FayazRahman

⚙️ Who Contributes

@AbhinavTuli, @Diveafall, @FayazRahman, @davidbuniat and @farizrahman4u

deeplake - v2.5.1 🌈

Published by github-actions[bot] over 2 years ago

🧭 What's Changed

  • remove tk from tests (#1703) @davidbuniat
  • [AL-1786] Throw exception if user cannot get the read-only flag they specifically requested (#1688) @farizrahman4u
  • InvalidAccessKeyId fix (#1690) @farizrahman4u
  • [AL-1796] Timestamps for video (#1697) @FayazRahman
  • [AL-1791] Fix linked sample verification & empty sample behaviour with json, text, list and link htypes (#1693) @AbhinavTuli
  • API reference updates (#1696) @FayazRahman
  • [AL-1643] Adds exist ok for tensor and group creation (#1689) @AbhinavTuli
  • [small] Remove files created during test_rechunk (#1692) @dhiganthrao
  • Fix get_views() (#1691) @farizrahman4u
  • [AL-1720] Data lineage (#1660) @farizrahman4u
  • Update README.md (#1687) @istranic
  • Fixed colab check and error message. Changed the default frame of the… (#1684) @khustup

⚙️ Who Contributes

@AbhinavTuli, @FayazRahman, @davidbuniat, @dhiganthrao, @farizrahman4u, @istranic and @khustup

deeplake - v2.5.0 🌈

Published by github-actions[bot] over 2 years ago

🧭 What's Changed

  • Fixed windows failing tests (#1682) @adolkhan
  • [AL-1751] Adds the ability to download dataset instead of streaming (#1668) @AbhinavTuli
  • [1780] All changes from branch issue_1601 (#1680) @adolkhan
  • Fix pytorch shuffle slowdown (#1679) @AbhinavTuli
  • Encoder fixes (#1676) @farizrahman4u
  • [AL-1764] Adds ability to partially read chunks (#1641) @AbhinavTuli
  • [AL-1645] Prevent printing visualization message in transforms (#1678) @FayazRahman
  • [AL-1763] [AL-1750] Fix shapes for sequence htype (#1640) @FayazRahman
  • update image_classification.py (#1650) @neel2299
  • Fix flush order (#1674) @farizrahman4u
  • Updated staging endpoint. (#1675) @khustup
  • [BUG] Image htypes fix for recompression (#1670) @FayazRahman
  • Fixes rechunking keyerror (#1672) @AbhinavTuli
  • Reduce arg combinations for query test (Make CI faster) (#1669) @farizrahman4u
  • Typo (#1667) @farizrahman4u
  • [AL-1745] .data() returns numpy arrays as list (#1665) @farizrahman4u
  • [AL-1721][AL-1722][AL-1728][AL-1729] Rechunking (#1637) @levongh

⚙️ Who Contributes

@AbhinavTuli, @FayazRahman, @adolkhan, @davidbuniat, @farizrahman4u, @khustup, @levongh and @neel2299

deeplake - v2.4.2 🌈

Published by github-actions[bot] over 2 years ago

🧭 What's Changed

  • [AL-1742] Partial update fix (#1652) @farizrahman4u
  • [AL-1784] Tiling bug fix (#1664) @farizrahman4u
  • [AL-1749] image.rgb and image.gray htypes (#1663) @farizrahman4u
  • [AL-1783] Reporting fixes (#1659) @farizrahman4u
  • [AL-1779] Adds get_creds method for link credentials (#1655) @AbhinavTuli
  • [AL-1765] Docstring for hub.tiled (#1661) @farizrahman4u
  • Revert "[AL-1749] image.rgb and image.gray htypes" (#1662) @farizrahman4u
  • [AL-1749] image.rgb and image.gray htypes (#1646) @FayazRahman
  • Fix tensor renames for ds.pytorch (#1658) @FayazRahman
  • Run hub tests in staging. (#1657) @khustup
  • Added points htype (#1656) @istranic
  • [#1178] Progress bar for .extend() (#1596) @dhiganthrao
  • [AL-1740] Append/Update empty samples using None (#1616) @AbhinavTuli
  • Add api reference for hub.link (#1649) @AbhinavTuli
  • [AL-1726] Metadata querying (#1620) @farizrahman4u
  • [AL-1756] Separate tiling threshold from min chunk size (#1633) @farizrahman4u
  • Fixes group deletion issues (#1645) @AbhinavTuli
  • [AL-1762] Fix multiple chunks case in queries (#1647) @AbhinavTuli

⚙️ Who Contributes

@AbhinavTuli, @FayazRahman, @davidbuniat, @dhiganthrao, @farizrahman4u, @istranic and @khustup

deeplake - v2.4.1 🌈

Published by github-actions[bot] over 2 years ago

🧭 What's Changed

  • [BUG] Delete sequence encoder when tensor is cleared (#1639) @FayazRahman
  • [AL-1766] hub.copy fix (2) + small refac (#1644) @farizrahman4u
  • Tensor meta fast forwarding + rename ds.copy creds arg (#1638) @farizrahman4u
  • [AL-1625] Rename groups (#1623) @FayazRahman

⚙️ Who Contributes

@FayazRahman, @davidbuniat and @farizrahman4u

deeplake - v2.4.0 🌈

Published by github-actions[bot] over 2 years ago

🧭 What's Changed

  • [AL-1760][AL-1761] Htype fixes (#1634) @farizrahman4u
  • [AL-1754] Add reporting for missing features (#1635) @farizrahman4u
  • Update README.zh-cn.md (#1626) @tatevikh
  • [AL-1759] Fix google drive import error when importing hub (#1632) @farizrahman4u
  • [AL-1743] Improve DynamicTensorNumpyError message for sequence tensors (#1628) @farizrahman4u
  • Fixes linked tensors + random assignment (#1630) @AbhinavTuli
  • Google Drive storage provider (#1266) @FayazRahman

⚙️ Who Contributes

@AbhinavTuli, @FayazRahman, @davidbuniat, @farizrahman4u and @tatevikh

deeplake - v2.3.5 🌈

Published by github-actions[bot] over 2 years ago

🧭 What's Changed

  • docker update, rm miniaudio (#1627) @davidbuniat
  • [AL-1744] Sequence + transform fix (#1624) @farizrahman4u
  • Reduces warning on copying datasets (#1611) @AbhinavTuli
  • [AL-1752] Version control + Sequences fix (#1621) @farizrahman4u
  • [AL-1747] Exif fix (#1615) @farizrahman4u
  • [AL-1695] PyAV audio implementation (#1576) @FayazRahman
  • [AL-1741] Suppress loading messages on dataset deserialization (#1614) @FayazRahman
  • Next iteration for ds.visualize (#1605) @khustup
  • added Hub citation (#1618) @mikayelh
  • [AL-1715] Adds support for linking of external data (#1582) @AbhinavTuli
  • Video Playback (#1592) @farizrahman4u
  • [AL-1623] Rename tensors (#1514) @FayazRahman
  • Add google drive refresh token to workflow (#1606) @FayazRahman

⚙️ Who Contributes

@AbhinavTuli, @FayazRahman, @davidbuniat, @farizrahman4u, @khustup and @mikayelh

deeplake - v2.3.4 🌈

Published by github-actions[bot] over 2 years ago

🧭 What's Changed

  • Retry backend requests on failure (#1604) @AbhinavTuli
  • Update README.md (#1593) @istranic
  • Support for additional image formats in hub.auto #1166 (#1531) @brlrb
  • Move pretty printing from str to summary() (#1587) @FayazRahman
  • Fix missing api reference links (#1579) @FayazRahman
  • [AL-1710] Allow branch creation when dataset is locked (#1584) @farizrahman4u
  • Changed str return to include Tensor-Wise information for issue #1439. (#1543) @neel2299
  • Set readonly=True by default on platform (#1583) @farizrahman4u
  • [AL-1724] tobytes for Pytorch & Tensorflow (#1580) @farizrahman4u
  • [AL-1713] Dicom (#1572) @farizrahman4u
  • Visualizer (#1567) @khustup
  • add a Mandarin version of README (#1577) @JinyiChenUofT

⚙️ Who Contributes

@AbhinavTuli, @FayazRahman, @JinyiChenUofT, @brlrb, @davidbuniat, @farizrahman4u, @istranic, @khustup and @neel2299

deeplake - v2.3.3 🌈

Published by github-actions[bot] over 2 years ago

🧭 What's Changed

  • Update README.md (#1578) @tatevikh
  • API reference updates (#1575) @FayazRahman
  • [AL-1704] Better video error message (#1570) @FayazRahman
  • [AL-1719] Adds ability to reset head changes (#1571) @AbhinavTuli
  • added mention for 100+ datasets & fixes (#1568) @mikayelh
  • [AL-1698] Sample info (#1550) @farizrahman4u
  • [AL-1712] Dataset view copy (#1564) @farizrahman4u
  • Fixes shuffle + collate (#1552) @AbhinavTuli
  • tensor.clear() to delete all samples from tensor (#1288) @FayazRahman
  • Update dataset rename + api ref updates (#1566) @FayazRahman
  • [AL-1349] Add alias for checking if a dataset exists. (#1569) @FayazRahman
  • [Bug fix] Fixes boto3 credentials reload issue in hub compute (#1562) @AbhinavTuli
  • [AL-1716] Display message if newer version of hub is available (#1563) @AbhinavTuli
  • [AL-1718] - Do not throw exception if there is NameException in ds.filter (#1555) @levongh
  • [AL-1338] Added merge functionality (#1521) @AbhinavTuli
  • [API Docs] Fixes shuffle link (#1558) @AbhinavTuli
  • [AL-1356] Dataset Renaming (#1538) @FayazRahman

⚙️ Who Contributes

@AbhinavTuli, @FayazRahman, @davidbuniat, @farizrahman4u, @levongh, @mikayelh and @tatevikh

deeplake - v2.3.2 🌈

Published by github-actions[bot] over 2 years ago

🧭 What's Changed

  • Quick fix for plat: remove is_sequence check (#1559) @farizrahman4u
  • AL-[1714] Add ResponseStreamingError to list of retry exceptions (#1540) @levongh
  • Update README.md (#1554) @davidbuniat
  • Revert "Updated formatting of features in readme" (#1551) @tatevikh
  • Updated formatting of features in readme (#1544) @istranic
  • [Tiny] Change dataset diff path (#1549) @AbhinavTuli
  • Disallow certain methods for dataset and tensor views (#1311) @FayazRahman
  • Fix TF requirement when installing hub (#1542) @farizrahman4u
  • AL-[1706] throw exception if committing unchanged dataset (#1535) @levongh
  • AL-[1616] Tensorflow API Design (#1530) @levongh
  • [AL-1711] API documentation (#1533) @levongh
  • [AL-1697] Dataset copying (#1527) @FayazRahman
  • [AL-1691][AL-1707] Sequence htype (#1511) @farizrahman4u
  • Remove ffmpeg ack (#1529) @farizrahman4u
  • [AL-1708] Support appending/extending with dataset views (#1526) @farizrahman4u
  • [AL-1698 (partial)] Hidden tensors (#1525) @farizrahman4u
  • Pytorch timeout=30 (#1520) @farizrahman4u

⚙️ Who Contributes

@AbhinavTuli, @FayazRahman, @davidbuniat, @farizrahman4u, @istranic, @levongh and @tatevikh

deeplake - v2.3.1 🌈

Published by github-actions[bot] over 2 years ago

🧭 What's Changed

  • [Bug] Fix memory leak (#1523) @AbhinavTuli
  • [Bug fix] Removes repeated key misses for CommitChunkSet in some old datasets (#1518) @AbhinavTuli
  • Update README.md (#1498) @istranic
  • Check if a dataset exists or not (Issue #1182) (#1517) @dhiganthrao
  • BytePositionsEncoder fix (#1516) @farizrahman4u
  • [Bug Fix] Fixes an issue with info not being updated properly outside with context (#1515) @AbhinavTuli
  • Store user name in lock file (#1489) @farizrahman4u
  • [AL-1678] Adds ability to rechunk tensors (#1510) @AbhinavTuli
  • [Small bug fix] Pytorch + transform dict bug (#1513) @AbhinavTuli
  • [AL-1679] Partial upload (#1484) @farizrahman4u
  • Dataset view saving: small fix for #plat (#1508) @farizrahman4u
  • Fix python shutting down error message (#1509) @farizrahman4u
  • [AL-1677] Allow out of order data insertion (recommended for internal use only, will greatly impact performance if used incorrectly) (#1507) @AbhinavTuli

⚙️ Who Contributes

@AbhinavTuli, @davidbuniat, @dhiganthrao, @farizrahman4u, @istranic and @tatevikh

deeplake - v2.3.0 🌈

Published by github-actions[bot] over 2 years ago

🧭 What's Changed

  • Updating installation instructions (#1506) @istranic
  • [AL-1699][AL-1700] Fix indexing issues (#1502) @FayazRahman
  • Fix hub compute + filter (#1504) @AbhinavTuli
  • setup.py fix (#1503) @farizrahman4u

⚙️ Who Contributes

@AbhinavTuli, @FayazRahman, @davidbuniat, @farizrahman4u and @istranic