Logprep

Log data preprocessing in Python

LGPL-2.1 License

Downloads
2.3K
Stars
26
Committers
13

Logprep - v10.0.0

Published by ekneg54 9 months ago

v10.0.0

Breaking

  • reimplement the logprep CLI, see logprep --help for more information.
  • remove feature to reload configuration by sending signal SIGUSR1
  • remove feature to validate rules because it is already included in logprep test config

Features

  • add a number_of_successful_writes metric to the s3 connector, which counts how many events were successfully written to s3
  • make the s3 connector work with the new _write_backlog method introduced by the confluent_kafka commit bugfix in v9.0.0
  • add option to Opensearch Output Connector to use parallel bulk implementation (default is True)
  • add feature to logprep to load config from multiple sources (files or uris)
  • add feature to logprep to print the resulting configuration in json or yaml with logprep print json|yaml <Path to config>
  • add an event generator that can send records to Kafka using data from a file or from Kafka
  • add an event generator that can send records to an HTTP endpoint using data from a local dataset

Improvements

  • add a do nothing option to dummy output to ensure dummy does not fill memory
  • make the s3 connector raise FatalOutputError instead of warnings
  • make the s3 connector blocking by removing threading
  • revert the change from v9.0.0 to always check the existence of a field for negated key-value based lucene filter expressions
  • make store_custom in s3, opensearch and elasticsearch connector not call batch_finished_callback to prevent data loss that could be caused by partially processed events
  • remove the schema_and_rule_checker module
  • rewrite Logprep Configuration object, see documentation for more details
  • rewrite Runner
  • delete MultiProcessingPipeline class to simplify multiprocessing
  • add FDA to the quickstart setup
  • bump versions for fastapi and aiohttp to address CVEs
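
The reverted existence check for negated key-value filters can be illustrated with a minimal sketch. It assumes that the v9.0.0 check meant a negated expression such as NOT key: value only matches when the field exists, and that reverting it lets a missing field satisfy the negation again. Both function names and the flat dictionary event are hypothetical; this is not Logprep's filter implementation.

```python
def negated_match_with_existence_check(event: dict, key: str, value: str) -> bool:
    """Assumed v9.0.0 reading: the field must exist and differ from value."""
    return key in event and str(event[key]) != value


def negated_match_without_existence_check(event: dict, key: str, value: str) -> bool:
    """Assumed v10.0.0 reading: a missing field also satisfies the negation."""
    return key not in event or str(event[key]) != value
```

Under these assumptions the two variants only disagree for events that lack the field entirely.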

Bugfix

  • make the s3 connector actually use the max_retries parameter
  • fix a bug that led to a FatalOutputError when handling a CriticalInputError in the pipeline

Details

Full Changelog: https://github.com/fkie-cad/Logprep/compare/v9.0.3...v10.0.0

Logprep - Development Build

Published by github-actions[bot] 9 months ago

Commits

  • d1717d6: Add load-tester (#487) (ppcad)

Logprep - Development Build

Published by github-actions[bot] 9 months ago

Commits

  • 87e9e21: Set no offsets on store custom (#518) (ppcad)

Logprep - Development Build

Published by github-actions[bot] 9 months ago

Commits

  • 54623b2: Fix handling of CriticalInputError exceptions (#514) (clumsy9)

Logprep - Development Build

Published by github-actions[bot] 9 months ago

Commits

  • f1b8545: Revise CLI (#513) (dtrai2)

Logprep - Development Build

Published by github-actions[bot] 9 months ago

Commits

  • bf0e41f: add do nothing option to dummy output (#503) (Jörg Zimmermann)

Logprep - Development Build

Published by github-actions[bot] 10 months ago

Commits

  • 74623ce: Adapt s3 connector for kafka fix (#499) (ppcad)

Logprep - Development Build

Published by github-actions[bot] 10 months ago

Commits

  • d13e040: add architecture overview (#478) (djkhl)

Logprep - v9.0.3

Published by ekneg54 10 months ago

v9.0.3

Breaking

Features

  • make thread_count, queue_size and chunk_size configurable for parallel_bulk in opensearch output connector

Improvements

Bugfix

  • fix parallel_bulk implementation not delivering messages to opensearch

Details

Full Changelog: https://github.com/fkie-cad/Logprep/compare/v9.0.2...v9.0.3

Logprep - v9.0.2

Published by dtrai2 11 months ago

Bugfix

  • remove duplicate pseudonyms in extra outputs of pseudonymizer

What's Changed

Full Changelog: https://github.com/fkie-cad/Logprep/compare/v9.0.1...v9.0.2

Logprep - v9.0.1

Published by ekneg54 11 months ago

Details

Full Changelog: https://github.com/fkie-cad/Logprep/compare/v9.0.0...v9.0.1

Logprep - v9.0.0

Published by ekneg54 11 months ago

v9.0.0

Breaking

  • remove possibility to inject auth credentials via url string, because of the risk of leaking credentials in logs
    • if you want to use basic auth, then you have to set the environment variables
      • LOGPREP_CONFIG_AUTH_USERNAME=<your_username>
      • LOGPREP_CONFIG_AUTH_PASSWORD=<your_password>
    • if you want to use oauth, then you have to set the environment variables
      • LOGPREP_CONFIG_AUTH_TOKEN=<your_token>
      • LOGPREP_CONFIG_AUTH_METHOD=oauth
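
As a rough illustration of how the environment variables above could be turned into request credentials, here is a minimal sketch. The helper build_auth_header is hypothetical, and the use of a Bearer header for oauth and a Basic header for username and password is an assumption; the release notes do not describe how Logprep consumes these variables internally.

```python
import base64
import os


def build_auth_header(env=os.environ) -> dict:
    """Derive an Authorization header from the LOGPREP_CONFIG_AUTH_* variables.

    Hypothetical helper, not part of Logprep.
    """
    if env.get("LOGPREP_CONFIG_AUTH_METHOD") == "oauth":
        # Assumption: the oauth token is sent as a Bearer token.
        token = env.get("LOGPREP_CONFIG_AUTH_TOKEN", "")
        return {"Authorization": f"Bearer {token}"}
    username = env.get("LOGPREP_CONFIG_AUTH_USERNAME")
    password = env.get("LOGPREP_CONFIG_AUTH_PASSWORD")
    if username is not None and password is not None:
        # Assumption: username and password are sent as HTTP basic auth.
        credentials = base64.b64encode(f"{username}:{password}".encode()).decode()
        return {"Authorization": f"Basic {credentials}"}
    # No credentials configured: send no Authorization header.
    return {}
```

Keeping the credentials in the environment means the configuration URL itself no longer contains secrets that could end up in log output.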

Features

Improvements

  • improve error message on empty rule filter
  • reimplemented pseudonymizer processor
    • rewrote tests to reach 100% coverage
    • cleaned up code
    • reimplemented caching using Python's lru_cache
    • add cache metrics
    • removed max_caching_days config option
    • add max_cached_pseudonymized_urls config option which defaults to 1000
    • add lru caching for pseudonymization of urls
  • improve loading times for the rule tree by optimizing the rule segmentation and sorting
  • add support for python 3.12 and remove support for python 3.9
  • always check the existence of a field for negated key-value based lucene filter expressions
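
The lru_cache based caching listed for the pseudonymizer above can be sketched with the standard library. The function pseudonymize_url and its SHA-256 digest are hypothetical stand-ins for the real pseudonymization; only the cache size of 1000 comes from the release notes. The cache_info() call shows where hit and miss counts for cache metrics could come from.

```python
import hashlib
from functools import lru_cache

MAX_CACHED_PSEUDONYMIZED_URLS = 1000  # default named in the release notes


@lru_cache(maxsize=MAX_CACHED_PSEUDONYMIZED_URLS)
def pseudonymize_url(url: str) -> str:
    """Hypothetical stand-in: replace a URL with a SHA-256 based pseudonym."""
    return hashlib.sha256(url.encode("utf-8")).hexdigest()


pseudonymize_url("https://example.com/a")  # miss: computed and cached
pseudonymize_url("https://example.com/a")  # hit: served from the cache
cache_info = pseudonymize_url.cache_info()  # hits, misses, maxsize, currsize
```

An LRU cache bounds memory by entry count rather than by age, which may be why a day-based option such as max_caching_days is no longer needed.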

Bugfix

  • fix the rule tree parsing some rules incorrectly, potentially resulting in more matches
  • fix confluent_kafka commit issue after kafka did some rebalancing, which also fixes negative offsets

Details

Full Changelog: https://github.com/fkie-cad/Logprep/compare/v8.0.0...v9.0.0

Logprep - Development Build

Published by github-actions[bot] 11 months ago

Commits

  • 3527bfa: fix changelog (#479) (Jörg Zimmermann)

Logprep - Development Build

Published by github-actions[bot] 11 months ago

Commits

  • 491c14b: Improve error message on empty rule filter (#471) (dtrai2)

Logprep - v8.0.0

Published by ekneg54 11 months ago

v8.0.0

Breaking

  • reimplemented metrics so the former metrics configuration won't work anymore
  • metric content changed and existing grafana dashboards will break
  • new rule id could possibly break configurations if the same rule is used in both rule trees
    • can be fixed by adding a unique id to each rule or deleting the possibly redundant rule

Features

  • add possibility to convert hex to int in calculator processor with the newly added function from_hex
  • add metrics on rule level
  • add grafana example dashboards under quickstart/exampledata/config/grafana/dashboards
  • add new configuration field id for all rules to identify rules in metrics and logs
    • if no id is given, the id will be generated in a stable way
    • add verification of rule id uniqueness on processor level over both rule trees to ensure metrics are counted correctly on rule level
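
To make the rule id behaviour above more concrete, here is a minimal sketch of stable id generation and a uniqueness check over both rule trees. The hashing scheme (SHA-256 over a canonical JSON dump) and both function names are assumptions for illustration; the release notes only state that generated ids are stable and must be unique per processor.

```python
import hashlib
import json


def stable_rule_id(rule: dict) -> str:
    """Return the configured id, or derive one from the rule content (assumed scheme)."""
    if "id" in rule:
        return str(rule["id"])
    # Sorting keys makes the digest independent of key order in the rule file.
    canonical = json.dumps(rule, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def find_duplicate_ids(generic_rules: list, specific_rules: list) -> set:
    """Collect ids that occur more than once across both rule trees."""
    seen, duplicates = set(), set()
    for rule in [*generic_rules, *specific_rules]:
        rule_id = stable_rule_id(rule)
        if rule_id in seen:
            duplicates.add(rule_id)
        seen.add(rule_id)
    return duplicates
```

With a content-derived id, the same rule placed in both rule trees yields the same id, which matches the breaking change noted for this release: give each copy an explicit id or remove the redundant one.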

Improvements

  • reimplemented prometheus metrics exporter to provide gauges, histograms and counter metrics
  • removed shared counter, because it is redundant to the metrics
  • get exception stack trace by setting environment variable DEBUG
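
A minimal sketch of DEBUG controlled error output follows. The helper format_error is hypothetical, and treating any non-empty value of DEBUG as enabled is an assumption; the release notes only say that setting the variable yields the stack trace.

```python
import os
import traceback


def format_error(error: Exception, env=os.environ) -> str:
    """Return a short message, or the full stack trace when DEBUG is set."""
    if env.get("DEBUG"):
        # Assumption: any non-empty DEBUG value enables the stack trace.
        return "".join(
            traceback.format_exception(type(error), error, error.__traceback__)
        )
    return f"{type(error).__name__}: {error}"
```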

Details

Full Changelog: https://github.com/fkie-cad/Logprep/compare/v7.0.0...v8.0.0

Logprep - Development Build

Published by github-actions[bot] 12 months ago

Commits

  • 2aa70de: convert hex to int in calculator (#463) (Jörg Zimmermann)

Logprep - v7.0.0

Published by dtrai2 about 1 year ago

Breaking

  • removed metric file target
  • move kafka config options to kafka_config dictionary for confluent_kafka_input and confluent_kafka_output connectors

Features

  • add a preprocessor to enrich events with system environment variables
  • add option to define rules inline in pipeline config under processor configs generic_rules or specific_rules
  • add option to field_manager to ignore missing source fields to suppress warnings and failure tags
  • add ignore_missing_source_fields behavior to calculator, concatenator, dissector, grokker, ip_informer, selective_extractor
  • kafka input connector
    • implemented manual commit behaviour if enable.auto.commit: false
    • implemented on_commit callback to check for errors during commit
    • implemented statistics callback to collect metrics from underlying librdkafka library
    • implemented per partition offset metrics
    • get logs and handle errors from underlying librdkafka library
  • kafka output connector
    • implemented statistics callback to collect metrics from underlying librdkafka library
    • get logs and handle errors from underlying librdkafka library
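
The ignore_missing_source_fields option listed above can be pictured with a small sketch. The function copy_fields, the list-valued target field, and the tag name _missing_field_failure are all hypothetical; the sketch only shows the general idea that missing source fields normally produce a failure tag and that the option suppresses it.

```python
def copy_fields(event: dict, source_fields: list, target_field: str,
                ignore_missing_source_fields: bool = False) -> None:
    """Hypothetical field-copy step that tags the event when source fields are missing."""
    missing = [field for field in source_fields if field not in event]
    if missing and not ignore_missing_source_fields:
        # Without the option a failure tag is added (tag name is made up here).
        event.setdefault("tags", []).append("_missing_field_failure")
    # Fields that do exist are still processed in both cases.
    values = [event[field] for field in source_fields if field in event]
    if values:
        event[target_field] = values
```

A short usage example under the same assumptions:

```python
event = {"a": 1}
copy_fields(event, ["a", "b"], "out", ignore_missing_source_fields=True)
# event now holds the copied value and carries no failure tag
```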

Improvements

  • pre_detector processor now adds the field creation_timestamp to pre-detections.
    It contains the time at which a pre-detection was created by the processor.
  • add prometheus and grafana to the quickstart setup to support development
  • provide confluent kafka test setup to run tests against a real kafka cluster
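
The creation_timestamp field added by the pre_detector can be sketched as follows. The function create_pre_detection, the description key, and the UTC ISO 8601 format are assumptions for illustration; the release notes only state that the field holds the time at which the pre-detection was created.

```python
from datetime import datetime, timezone


def create_pre_detection(rule_description: str) -> dict:
    """Hypothetical pre-detection record stamped with its creation time."""
    return {
        "description": rule_description,
        # Assumption: a timezone-aware UTC timestamp in ISO 8601 format.
        "creation_timestamp": datetime.now(timezone.utc).isoformat(),
    }
```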

Bugfix

  • fix CVE-2023-37920 Removal of e-Tugra root certificate
  • fix CVE-2023-43804 Cookie HTTP header isn't stripped on cross-origin redirects
  • fix CVE-2023-37276 aiohttp.web.Application vulnerable to HTTP request smuggling via llhttp HTTP request parser

Details

Full Changelog: https://github.com/fkie-cad/Logprep/compare/v6.8.1...v7.0.0

Logprep - Development Build

Published by github-actions[bot] about 1 year ago

Commits

  • 66ed09a: prepare release 7.0.0 (#461) (dtrai2)

Logprep - Development Build

Published by github-actions[bot] about 1 year ago

Commits

  • 9458a24: remove MultiprocessLogHandler (#455) (Jörg Zimmermann)

Logprep - Development Build

Published by github-actions[bot] about 1 year ago

Commits

  • 02448cd: Remove MetricFileTarget (#456) (Jörg Zimmermann)

Package Rankings
Top 16.14% on Pypi.org