airflow

Apache Airflow - A platform to programmatically author, schedule, and monitor workflows

APACHE-2.0 License

Downloads: 57.5M
Stars: 36.1K
Committers: 3.2K

airflow - 1.7.0

Published by mistercrunch over 8 years ago

Raw changelog

For the record, this changelog was generated with:
~/node_modules/github-changes/bin/index.js -o airbnb -r airflow --only-pulls --use-commit-body --between-tags 1.6.2...1.7.0rc1 --token {your_gh_token_here}

1.7.0

  • #1147 Enhance CLI Test command to accept a JSON-formatted dictionary of par… (@r39132)
  • #975 statuses column on /admin shows only active or most recent dag_runs (@jtschoonhoven)
  • #1127 FAQ entry about start_date (@mistercrunch)
  • #1129 Showing active dag runs as in (3/16) in tooltip (@mistercrunch)
  • #1144 More explicit zindex on modals (@mistercrunch)
  • #1143 small fixes to previous bq project inclusion pr. (@mtagle)
  • #1139 Allow specification of project in BigQuery Hook methods (@mtagle)
  • #1110 Add date support to MySQL to GCS operator (@criccomini)
  • #1135 Added start_date initialization for DagRun creation within schedule_dag(self, dag_id) (@RvN76)
  • #1140 License check (@bolkedebruin)
  • #1138 Add support for BigQuery User Defined Functions in BigQuery operator (@LeBlanc)
  • #1132 Add custom email backends. (@jmcarp)
  • #1119 Add gcloud-based GCSHook (@jlowin)
  • #1134 Parameterizing DagBag import timeouts (@mistercrunch)
  • #1045 Add startup scripts for upstart based systems (@d-lee)
  • #1131 Adding ssh connection type to webform (@hyperborea)
  • #792 add error handling for slack api (@abridgett)
  • #1115 This patch adds license checking for Airflow. (@bolkedebruin)
  • #1116 Add WePay and committer list to README.md (@criccomini)
  • #1118 Don't force installation of GCP API client dependencies (@jlowin)
  • #1090 Adding template support in qbol operator (@msumit)
  • #1108 new company + link to pitfalls (@kretes)
  • #1104 Add INT24 (MEDIUMINT) support to MySQL to Google cloud storage operator (@criccomini)
  • #1076 SubDAG docs and examples (@nicktrav)
  • #1101 modify datastore hook so that authorization is maintained for the lif… (@mtagle)
  • #1099 Adding @mention in Airbnb and links to airbnb open source projects (@artwr)
  • #1096 Make SqlAlchemy pool_recycle and pool_size configurable (@amread)
  • #1070 Proper sqlalchemy syntax for desc (@mistercrunch)
  • #1079 SID Oracle DB connection support (@biln)
  • #1094 Adding to inits (@artwr)
  • #1093 Add MySQL to BQ support for TINYINT (@criccomini)
  • #1088 Fix MySQL to Google cloud storage scoping. (@criccomini)
  • #1075 Pig hook and operator stub (@artwr)
  • #1084 Replace deprecated flask.ext.* with flask_* (@jeffwidman)
  • #1082 Cleanup Contributing.md (@jeffwidman)
  • #1086 Default to 0 if no rows loaded in GCS to BQ operator. (@criccomini)
  • #1080 Add MySQL->GCS, GCS->BQ operators (@criccomini)
  • #1073 Fixing small issue with qbol operator and hook (@msumit)
  • #1068 Add a new hook for google datastore (@mtagle)
  • #1065 dag pausing should pause queued tasks as well (@mistercrunch)
  • #1067 Add two methods to BigQueryBaseCursor: (@mtagle)
  • #1064 Add Thumbtack to organizations using Airflow (@natekupp)
  • #1063 Documenting task details doc_ feature (@mistercrunch)
  • #1049 Allow the use of the autoconfig client (@bolkedebruin)
  • #1056 Add output encoding option to BashOperator (@Attsun1031)
  • #996 Added the dst filename to template_fields GoogleCloudStorageDownloadOperator (@0xR)
  • #1051 Running unit tests with local executor (@mistercrunch)
  • #1008 Add FTPSHook class. (@geeknam)
  • #1055 Improve [Hide/Show all series] performance (@lumengxi)
  • #1054 CLI's trigger_dag now accepts --conf as json (@mistercrunch)
  • #1053 Add Kogan.com to the list of users (@geeknam)
  • #1046 Documenting the cluster policy feature (@mistercrunch)
  • #1044 Updating the Readme with a link to the TriggerDagRunOperator post (@r39132)
  • #1043 Adding an example to illustrate the TriggerDagRunOperator (@r39132)
  • #1039 Only set headers and delimiters for CSV exports in Google BigQuery hook (@criccomini)
  • #1026 Minor documentation tweaks to the FAQ under the fernet key section (@r39132)
  • #1025 Add notes on connection password encryption (@d-lee)
  • #1018 add SSL support for SMTP (@bbrumi)
  • #1020 Add BigQuery copy operator. (@criccomini)
  • #964 Clean test database out in between unit test runs (@nicktrav)
  • #1019 Add direct dependencies for Google cloud contribs (@criccomini)
  • #985 [airflow][presto] Gracefully handle 503 errors and avoid eval() (@airbnb)
  • #1013 Fixed issue 1012: pool not used with celery executor (@rdavison)
  • #1011 drop the tmp table after ingestion (@dayzzz)
  • #995 Fix LDAP error messages when login fails. (@criccomini)
  • #1006 Fix forgetting to expose yesterday_ds_nodash and tomorrow_ds_nodash (@0xR)
  • #1003 Update to the README such as adding Hootsuite (@r39132)
  • #998 Pedantic documentation tweaks (@joshmarlow)
  • #940 Refresh border coloring in graph view without having to refresh the entire page (@DinoCow)
  • #987 Add logout button to Airflow (@criccomini)
  • #930 When either data_profiler_filter or superuser_filter aren't defined,… (@mtagle)
  • #994 there could be dot in the key string, which is illegal in the hive ta… (@dayzzz)
  • #993 Added yesterday_ds_nodash and tomorrow_ds_nodash (@0xR)
  • #992 Rename BashOperator.xcom_push member to avoid collision with parent class (@seregasheypak)
  • #978 Fixing conflicting params in default_args (@mistercrunch, @nicktrav)
  • #984 Add tests for params handling in Dag construction (@nicktrav)
  • #986 [documentation] document that max_active_runs can prevent a DAG from running (@aoen)
  • #983 Added destination_dataset_table to template_fields of bigquery_operator (@0xR)
  • #981 Add BigQuery PEP 249 support (@criccomini)
  • #980 Support user-defined macros and params in dry-run backfills with task… (@r39132)
  • #976 More verbose logging for are_dependencies_met when called from run (@mistercrunch)
  • #944 Fix statuses in dag list for dags with dag_ids prefixed by a number (@wil5for)
  • #942 Set effective user of (web)hdfs hooks using connection config, optionally overridden by constructor (@xadhoom)
  • #965 Adding mock lib to devel extras_require (@mistercrunch)
  • #963 Only schedule DagRuns between start and end dates (@nicktrav)
  • #948 Fix encryption alert msg for env var, configuration->conf refactor (@mistercrunch)
  • #960 Fix parsing file that contains multi byte char (@Attsun1031)
  • #949 Allow for domain-wide delegation in google cloud apps (@mtagle)
  • #958 Fix bug in reporting of attempt number when queuing tasks in a pool (@r39132)
  • #946 Add dag_state to cli (@wil5for)
  • #956 Support new configuration dags_are_paused_at_creation that, when True… (@r39132)
  • #945 Add documentation to gcs_download_operator (@criccomini)
  • #947 Encrypt Variables if Fernet key provided (@r39132)
  • #943 Fix running custom pool task with mark-success is not marked success (@Attsun1031)
  • #816 Implement SSH Execute Operator (@KeenS)
  • #941 Fix process_subdir bug (@Attsun1031)
  • #939 Show rendered templates from the CLI (@mistercrunch)
  • #224 PrestoToMySqlTransfer (@mistercrunch)
  • #801 allow slack attachments to be templated (@abridgett)
  • #938 [hotfix] dag missing from dagbag (@mistercrunch)
  • #862 handle default option for extra_options argument in HttpHook.run method (@JoergRittinger)
  • #935 Fix and refactor the httphook (@mistercrunch)
  • #919 Add support for three-legged OAuth for Google connections. Useful for… (@criccomini)
  • #937 Statsd abstraction (@mistercrunch)
  • #743 Get connection string containing username and password during runtime (@praveev)
  • #926 Normalize plugin paths that include both slashes and dots (@jasonjho)
  • #934 Support for encrypting the connection extra field (@r39132)
  • #928 fix function name error: active_tasks (@flying5)
  • #882 Docker operator (@asnir)
  • #925 Small change to fix a crash when reading log files from remote workers (@kmevissen)
  • #868 Encrypt logs (@vansivallab)
  • #927 [fix] disregarding adhoc tasks when closing dag runs (@mistercrunch)
  • #918 Reverting production issues from 876 and undead (@mistercrunch)
  • #922 Add link to SmartNews in README (@takus)
  • #875 SLA Miss Alert Callbacks : Allow DAGs to specify a callback function for SLA miss handling (@r39132)
  • #917 Allowing for relative path and dot notation for -sd (@mistercrunch)
  • #901 Make Google Cloud Storage download operator use a filename, not a fil… (@criccomini)
  • #912 Add Clover Health to Airflow users (@vansivallab)
  • #813 Add "search_scope" as a configuration variable for LDAP (@NeilHanlon)
  • #910 Add similarweb airflow puppet module to the readme (@danielbenzvi)
  • #909 Fixes typo (@rosner)
  • #907 Call Session.remove after each run, to survive DB restart (@KMK-ONLINE)
  • #906 Update some datetime column default args for consistent treatment (@t1m0thy)
  • #904 Bugfix for TriggerDagRunOperator (@t1m0thy)
  • #905 Add SimilarWeb to the who uses airflow section (@danielbenzvi)
  • #903 Fixing the tutorial. Removing an unnecessary import of MySqlOperator … (@cesararevalo)
  • #797 Add super user and profiler to ldap (@criccomini)
  • #898 Making force a task instance member, so it becomes available for (@iddoav)
  • #896 Cherry picked commits out of recent rollback (@mistercrunch)
  • #894 [hotfix] fixing infinite retries in prod (@mistercrunch)
  • #893 Fix ISSUE 798: password printed to STDOUT when running initdb, resetdb, upgradedb (@DinoCow)
  • #889 Merging airbnb_prod hotfixes into master (@mistercrunch)
  • #887 Killing tasks that aren't in a running state (@mistercrunch)
  • #888 Airbnb prod (@mistercrunch)
  • #808 Adding support for QDS (Qubole Data Service) (@msumit)
  • #876 Airbnb prod (@iivvaall)
  • #884 Fixing issue where try_number isn't incremented (@mistercrunch)
  • #878 state wasn't being saved (@abridgett)
  • #879 Readme update (@thibault-ketterer)
  • #835 Issue 832: creating run_id if not given while executing trigger_dag (@msumit)
  • #867 Airbnb prod (@mistercrunch)
  • #866 [hotfix] fixing subdag not refreshing properly (@mistercrunch)
  • #865 Bigquery and Google Cloud Storage operators (@criccomini)
  • #861 Bigquery operator (@criccomini)
  • #779 Adding base date and run number form to Task Duration and Landing Times views (@MaximeKestemont)
  • #854 [hotfix] subdag not showing up (@mistercrunch)
  • #853 Fixing a bug where some dags can't be retrieved from DagBag.get_dag (@mistercrunch)
  • #848 Delete dags from dagbag (@mistercrunch)
  • #851 add a parameter for number of shard in batch ingestion (@dayzzz)
  • #850 Update models.py to increase password field length (@rahul342)
  • #763 Added title attribute for logs (@JordyMoos)
  • #817 Fix ISSUE-812 (@bolkedebruin)
  • #840 Fixing SLA handling related bug (@mistercrunch)
  • #841 Add a Gitter chat badge to README.md (@gitter-badger)
  • #839 Adding visibility as to which dag is pickleable (@mistercrunch)
  • #826 Feature/cli fixes (@abridgett)
  • #830 Add Sidecar Interactive to list of companies using Airflow (@robottokauf3)
  • #825 color alternate rows so it's easier to use (@abridgett)
  • #802 Implemented GitHub Enterprise Authentication (@mtp401)
  • #831 order dag run drop down in graph view (@abridgett)
  • #828 Fix invalid syntax in SSHHook (@robottokauf3)
  • #822 Merging prod hotfixes into master (@mistercrunch)
  • #824 Upgrade flask admin (@mistercrunch)
  • #823 Fixing bad ONE_FAILED in recent PR (@mistercrunch)
airflow -

Published by mistercrunch almost 9 years ago

1.6.2 is mostly a release with bug fixes and a few relatively minor features. Thanks to all contributors!

airflow - v1.6.1

Published by mistercrunch almost 9 years ago

  • Scheduler bugfix
airflow - v1.6.0

Published by mistercrunch almost 9 years ago

v1.6.0 brings:

  • [scheduler] the notion of DAG runs allows for more parallelization and adds controls around scheduling (max number of running task instances per DAG, max number of DAG runs to be evaluated for scheduling, ...)
  • [scheduler] support for "externally triggered" DAG runs, or DAGs that run on demand as opposed to on a schedule
  • [scheduler] support for cron-like syntax (as in: "0 0 * * *") and macros (as in "@monthly", "@hourly", "@weekly", ...)
  • UI changes related to new scheduler features
  • LDAP authentication for the web UI, more extensible authentication backend
  • UI activity logging
  • WebHdfsSensor and Hook for HDFS interactions that are py3 compatible
  • Continuous integration with Travis CI and Coveralls
  • ShortCircuitOperator
  • python3 compatibility!
  • Tons of bug fixes and incremental improvements
  • + all the things I'm forgetting while browsing through an infinite list of commits!
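
The cron-like syntax and "@" macros mentioned above can be pictured as a small lookup followed by a sanity check. The following is a minimal sketch, not Airflow's scheduler code: the function name `normalize_schedule` is hypothetical, and the cron expressions in the table are assumed from the conventional cron nicknames ("@daily" and "@yearly" are not named in these notes).

```python
# Illustrative only: a stand-alone mapping from "@" macros to cron
# expressions. The table is an assumption based on conventional cron
# nicknames, not a copy of Airflow's internals.
CRON_PRESETS = {
    "@hourly": "0 * * * *",
    "@daily": "0 0 * * *",
    "@weekly": "0 0 * * 0",
    "@monthly": "0 0 1 * *",
    "@yearly": "0 0 1 1 *",
}


def normalize_schedule(schedule):
    """Return a five-field cron expression for a macro or a raw cron string."""
    expression = CRON_PRESETS.get(schedule, schedule)
    fields = expression.split()
    if len(fields) != 5:
        raise ValueError("expected 5 cron fields, got %r" % schedule)
    return " ".join(fields)


print(normalize_schedule("@monthly"))   # 0 0 1 * *
print(normalize_schedule("0 0 * * *"))  # 0 0 * * *
```

Unknown macros fall through the lookup unchanged and then fail the five-field check, so a typo such as "@montly" raises an error instead of being silently scheduled.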

Thanks to everyone in the community for all the PRs (stellar contributions!), comments and issue reporting.

airflow - v1.5.2

Published by mistercrunch almost 9 years ago

This list is probably missing some important items, but most changes should be captured:

  • Initial setup on Travis CI provides continuous integration and automated testing; there are no Hadoop unit tests yet, but they are coming
  • Unit test coverage reports with Coveralls
  • Better py3 compatibility: unit tests run against both 2.7 and 3.4, and we're now using the from __future__ imports to prevent regressions
  • A MesosExecutor to run your tasks on Mesos
  • Some Kerberos integration for Hive / Hadoop
  • The state legend entries in the DAG graph view are now toggles that highlight tasks in specific states
  • Automated zombie task instance killing as part of the scheduler's routine. The process looks for running tasks that don't have a heartbeat and kills them
  • MySqlHook bulk load option
  • More options in the UI's Mark Success form
  • MySQL uses mysqlclient lib instead of mysql-python
  • Using gunicorn instead of tornado as the wsgi web server
  • OracleHook
  • FTPHook
  • Much more! Tons of bug fixes and usability improvements.
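
The zombie-killing item above describes a simple check: find tasks marked as running whose heartbeat has gone missing. A minimal sketch of that idea follows, assuming heartbeats are stored as timestamps; the function `find_zombies`, the task names, and the five-minute `max_silence` threshold are all hypothetical and are not Airflow settings or APIs.

```python
from datetime import datetime, timedelta


def find_zombies(running_tasks, now, max_silence=timedelta(minutes=5)):
    """Return ids of running tasks whose last heartbeat is missing or too old.

    running_tasks maps a task id to its last heartbeat (datetime) or None.
    max_silence is a hypothetical threshold chosen for this sketch.
    """
    zombies = []
    for task_id, last_heartbeat in sorted(running_tasks.items()):
        if last_heartbeat is None or now - last_heartbeat > max_silence:
            zombies.append(task_id)
    return zombies


now = datetime(2015, 9, 1, 12, 0, 0)  # arbitrary reference time
tasks = {
    "load_events": now - timedelta(seconds=30),   # healthy
    "build_report": now - timedelta(minutes=20),  # stale heartbeat
    "send_email": None,                           # never reported
}
print(find_zombies(tasks, now))  # ['build_report', 'send_email']
```

A real scheduler would then terminate the listed tasks; the sketch stops at detection so that it stays free of side effects.
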
airflow -

Published by mistercrunch about 9 years ago

Bugfix around XCom table creation timestamp issue

airflow -

Published by mistercrunch about 9 years ago

v1.5.0 is a huge release. Tons of important features.

Make sure to run airflow upgradedb after you upgrade.

Improvements:

  • @jlowin landed XCom, a feature for communicating information across tasks
  • @neovintage integrated Airflow with Alembic, making database migrations easy; run airflow upgradedb to get your database up to date as you upgrade Airflow
  • The dependency engine is now more flexible and supports trigger rules. Before this update, tasks were only triggered when all dependencies were successful (still the default). Now you can set tasks to trigger when a single parent succeeds, when one fails, when they all fail, or to fire regardless of their dependencies.
  • @neovintage added support for defining connections in environment variables, which lets you bypass or override the metadata database
  • @jlowin improved the tree view to render a non-expandable tree when trees go above 5k nodes
  • The Druid hook and HiveToDruidTransfer are maturing and becoming production grade
  • @kapil-malik added a UI feature where some users can only see their own DAGs while superusers can still see all; this feature is turned off by default
  • All operators and some preoperators can now receive a list of SQL statements to be executed as a chain
  • Improved the task_instance table to log the operator name and queued timestamp
  • Passwords in the metadata can now be encrypted
  • Improvements to the unit tests (speed + coverage)
  • Bugfixes + more
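
The trigger rules described in the list above can be sketched as a small decision function over the states of a task's parents. This is an illustration under stated assumptions, not Airflow's dependency engine: the function `should_trigger` is hypothetical, the rule names are labels chosen for this sketch and may not match Airflow's identifiers, and it only looks at final parent states, whereas a real scheduler may fire as soon as a rule is satisfied.

```python
def should_trigger(rule, parent_states):
    """Decide whether a task fires, given its parents' final states.

    parent_states is a list of "success" / "failed" strings. The rule
    names are labels chosen for this sketch.
    """
    successes = parent_states.count("success")
    failures = parent_states.count("failed")
    total = len(parent_states)

    if rule == "all_success":   # the default: every parent succeeded
        return successes == total
    if rule == "one_success":   # a single parent succeeding is enough
        return successes >= 1
    if rule == "one_failed":    # a single parent failing is enough
        return failures >= 1
    if rule == "all_failed":    # every parent failed
        return failures == total
    if rule == "always":        # fire regardless of dependencies
        return True
    raise ValueError("unknown trigger rule: %r" % rule)


parents = ["success", "failed", "success"]
for rule in ("all_success", "one_success", "one_failed", "all_failed", "always"):
    print(rule, should_trigger(rule, parents))
```

With the mixed parent states shown, only "one_success", "one_failed", and "always" evaluate to True, which matches the intent of using non-default rules for cleanup or alerting tasks.
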
airflow -

Published by mistercrunch about 9 years ago

  • Python3 compatibility improvements
  • + TimeDeltaSensor
  • + Slack related hooks and operators
  • Fancy widget for pausing DAGs from the main dash
  • Bugfixes and polish!