Funcy with pipeline-based operators

If Funcy and Pipe had a baby. Deal with data transformation in Python in a sane way.

I love Ruby. It's a great language, and one of the things it got right was pipelined data transformation. Elixir got this even more right with the explicit pipeline operator |>.

However, Python is the way of the future. As I worked more with Python, it was driving me nuts that the data transformation options were not chainable.

This project fixes this pet peeve.

Installation

pip install funcy-pipe

Or, if you are using poetry:

poetry add funcy-pipe

Examples

Extract a couple of key values from a SQLAlchemy model:

import funcy_pipe as fp

# the parentheses let the pipeline span multiple lines
(
  entities_from_sql_alchemy
  | fp.lmap(lambda r: r.to_dict())
  | fp.lmap(lambda r: r | fp.omit(["id", "created_at", "updated_at"]))
  | fp.to_list
)
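
The chaining works because Python falls back to the right-hand operand's `__ror__` method when the left-hand value does not know how to handle `|`. Below is a minimal sketch of that mechanism. It is not funcy-pipe's actual implementation: `Pipeable` and the local `lmap`, `omit`, and `to_list` helpers are hypothetical stand-ins defined only for this illustration.

```python
class Pipeable:
    """Hypothetical wrapper: `value | Pipeable(fn)` evaluates to `fn(value)`."""

    def __init__(self, fn):
        self.fn = fn

    def __call__(self, *args, **kwargs):
        # Bind the leading arguments now; the piped value is supplied last.
        return Pipeable(lambda piped: self.fn(*args, piped, **kwargs))

    def __ror__(self, piped):
        # Invoked for `piped | self` when `piped` cannot handle `|` itself.
        return self.fn(piped)


# Illustrative stand-ins, not the real funcy-pipe functions.
lmap = Pipeable(lambda fn, seq: list(map(fn, seq)))
omit = Pipeable(lambda keys, d: {k: v for k, v in d.items() if k not in keys})
to_list = Pipeable(list)

rows = [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}]
result = rows | lmap(lambda r: r | omit(["id"])) | to_list
print(result)
```

Note that `r | omit([...])` works even though `r` is a dict: a dict only merges with other mappings, so Python hands the operation to the right-hand side.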

Or, you can be more fancy and use whatever and pmap:

import funcy_pipe as fp
import whatever as _

# the parentheses let the pipeline span multiple lines
(
  entities_from_sql_alchemy
  | fp.lmap(_.to_dict)
  | fp.pmap(fp.omit(["id", "created_at", "updated_at"]))
  | fp.to_list
)

Create a map from an array of objects, ensuring the key is always an int (this assumes import funcy as f and from whatever import that):

section_map = api.get_sections() | fp.group_by(f.compose(int, that.id))
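
For readers new to `group_by`, here is a rough plain-Python sketch of what the line above computes. The `group_by_key` helper and the sample sections are made up for illustration; the real `api.get_sections()` objects are not shown in this README, so their shape is an assumption.

```python
from collections import defaultdict
from types import SimpleNamespace


def group_by_key(key_fn, items):
    """Hypothetical helper: collect items into lists keyed by key_fn(item)."""
    groups = defaultdict(list)
    for item in items:
        groups[key_fn(item)].append(item)
    return dict(groups)


# Made-up sample data: ids arrive as strings, as many APIs return them.
sections = [
    SimpleNamespace(id="1", name="Inbox"),
    SimpleNamespace(id="2", name="Work"),
    SimpleNamespace(id="1", name="Inbox (archived)"),
]

# Same idea as compose(int, that.id): read the attribute, then convert to int.
section_map = group_by_key(lambda s: int(s.id), sections)
print(sorted(section_map))
```

Each key maps to a list, so two sections sharing an id end up in the same bucket.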

Grab the ID of a specific user:

filter_user_id = (
  collaborator_map().values()
  | fp.where(email=target_user)
  | fp.pluck("id")
  | fp.first()
)

Get distinct values from a list (in this case, GitHub events):

events = [
  {
    "type": "PushEvent"
  },
  {
    "type": "CommentEvent"
  }
]

result = events | fp.pluck("type") | fp.distinct() | fp.to_list()

assert ["PushEvent", "CommentEvent"] == result

What if the objects are not dicts?

filter_user_id = (
  collaborator_map().values()
  | fp.where_attr(email=target_user)
  | fp.pluck_attr("id")
  | fp.first()
)
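
The difference between the dict and attribute variants can be sketched without any library. This is only an approximation of what the two pipelines do; the sample users and the `target_user` value are invented for the example.

```python
from types import SimpleNamespace

target_user = "bob@example.com"  # made-up sample value

users_as_dicts = [
    {"id": 1, "email": "alice@example.com"},
    {"id": 2, "email": "bob@example.com"},
]
users_as_objects = [SimpleNamespace(**u) for u in users_as_dicts]

# Roughly where(email=...) | pluck("id") | first(): key lookups on dicts.
dict_id = next(
    (u["id"] for u in users_as_dicts if u.get("email") == target_user), None
)

# Roughly where_attr(email=...) | pluck_attr("id") | first(): attribute lookups.
attr_id = next(
    (u.id for u in users_as_objects if getattr(u, "email", None) == target_user),
    None,
)

# With no match, this sketch yields None rather than raising.
missing_id = next(
    (u["id"] for u in users_as_dicts if u.get("email") == "nobody@example.com"),
    None,
)
print(dict_id, attr_id, missing_id)
```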

How about creating a dict where each value is sorted:

from operator import itemgetter
import funcy as f  # plain funcy (assumed alias)

(
  data
  # each element is a dict of city information, let's group by state
  | fp.group_by(itemgetter("state_name"))
  # now let's sort each value by population, which is stored as a string
  | fp.walk_values(
    f.partial(sorted, reverse=True, key=lambda c: int(c["population"])),
  )
)
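
The `int(...)` conversion in the sort key matters because populations stored as strings would otherwise compare character by character. A small standard-library sketch of the same group-then-sort idea, using made-up city records:

```python
from collections import defaultdict

# Made-up sample data; population is a string, as in the example above.
data = [
    {"city": "Smallville", "state_name": "Kansas", "population": "9"},
    {"city": "Bigtown", "state_name": "Kansas", "population": "10"},
    {"city": "Lonely", "state_name": "Nevada", "population": "5"},
]

# Step 1: group the city dicts by state.
by_state = defaultdict(list)
for city in data:
    by_state[city["state_name"]].append(city)

# Step 2: sort each state's cities by numeric population, largest first.
sorted_by_state = {
    state: sorted(cities, key=lambda c: int(c["population"]), reverse=True)
    for state, cities in by_state.items()
}

# Without int(), "9" sorts ahead of "10" because strings compare lexically.
naive = sorted(by_state["Kansas"], key=lambda c: c["population"], reverse=True)
print([c["city"] for c in sorted_by_state["Kansas"]])
```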

A more complicated example (lifted from this project):

comments = (
    # tasks are pulled from the todoist api
    tasks
    # get all comments for each relevant task
    | fp.lmap(lambda task: api.get_comments(task_id=task.id))
    # each task's comments are returned as an array, let's flatten this
    | fp.flatten()
    # dates are returned as strings, let's convert them to datetime objects
    | fp.lmap(enrich_date)
    # no date filter is applied by default, we don't want all comments
    | fp.lfilter(lambda comment: comment["posted_at_date"] > last_synced_date)
    # comments do not come with who created the comment by default, we need to hit a separate API to add this to the comment
    | fp.lmap(enrich_comment)
    # only select the comments posted by our target user
    | fp.lfilter(lambda comment: comment["posted_by_user_id"] == filter_user_id)
    # there is no `sort` in the funcy library, so we reexport the sort built-in so it's pipe-able
    | fp.sort(key="posted_at_date")
    # create a dictionary of task_id => [comments]
    | fp.group_by(lambda comment: comment["task_id"])
)

Want to grab the values of a list of dict keys?

def add_field_name(input: dict, keys: list[str]) -> dict:
    return input | {
        "field_name": (
            keys
            # this is a sneaky trick: `input.get` is a bound method, so when it's called it still
            # holds a reference to `input`
            | fp.map(input.get)
            | fp.compact
            | fp.join_str("_")
        )
    }

result = [{ "category": "python", "header": "functional"}] | fp.map(fp.rpartial(add_field_name, ["category", "header"])) | fp.to_list
assert result == [{'category': 'python', 'header': 'functional', 'field_name': 'python_functional'}]
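
The bound-method trick is plain Python and can be seen without the pipe syntax. The sketch below mirrors the function above using only built-ins, with `filter(None, ...)` standing in for `compact` and `"_".join` standing in for `join_str`; the `add_field_name_plain` name is invented for this example.

```python
def add_field_name_plain(record: dict, keys: list) -> dict:
    # record.get is a bound method: it remembers `record`, so map() only
    # needs to supply each key.
    values = map(record.get, keys)
    # Drop missing (None) and other falsy values.
    present = filter(None, values)
    return {**record, "field_name": "_".join(present)}


row = {"category": "python", "header": "functional"}
result = add_field_name_plain(row, ["category", "header", "missing_key"])
print(result["field_name"])
```

Keys that are absent from the dict simply disappear from the joined name, because `dict.get` returns `None` for them.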

You can also easily test multiple conditions across API data (extracted from this project):

all_checks_successful = (
    last_commit.get_check_runs()
    | fp.pluck_attr("conclusion")
    # if you pass a set into `all` each element of the set is used to build a predicate
    # this condition tests if the "conclusion" attribute is either "success" or "skipped"
    | fp.all({"success", "skipped"})
)
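
Funcy treats a set used as a predicate as a membership test, so the last step above is roughly the built-in `all` over `value in allowed`. A plain-Python sketch with invented conclusion values:

```python
allowed = {"success", "skipped"}


def all_in(allowed_values, seq):
    """Hypothetical helper: True when every element is one of allowed_values."""
    return all(item in allowed_values for item in seq)


passing = all_in(allowed, ["success", "skipped", "success"])
failing = all_in(allowed, ["success", "failure"])
# Caveat of the built-in all(): an empty sequence counts as True.
empty = all_in(allowed, [])
print(passing, failing, empty)
```

If a commit with no check runs should not count as successful, that case needs a separate check.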

You can also easily group dictionaries by a key (or arbitrary function):

import operator

result = [{"age": 10, "name": "Alice"}, {"age": 12, "name": "Bob"}] | fp.group_by(operator.itemgetter("age"))
assert result == {10: [{'age': 10, 'name': 'Alice'}], 12: [{'age': 12, 'name': 'Bob'}]}

Extras

to_list
log
bp. Run breakpoint() on the input value
sort
exactly_one. Throw an error if the input is not exactly one element
reduce
pmap. Pass each element of a sequence into a pipe'd function
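
To make the intent of exactly_one concrete, here is a hedged sketch of such a helper. The exception type and the choice to return the single element are assumptions made for illustration; check the funcy-pipe source for the real behavior.

```python
def exactly_one(seq):
    """Sketch: return the only element of seq, or raise if the count is not 1."""
    items = list(seq)
    if len(items) != 1:
        # ValueError is an assumed choice for this sketch.
        raise ValueError(f"expected exactly one element, got {len(items)}")
    return items[0]


only = exactly_one(x for x in [3, 8, 11] if x % 2 == 0)

try:
    exactly_one([1, 2])
    raised = False
except ValueError:
    raised = True

print(only, raised)
```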

Extensions

There are some functions which are not yet merged upstream into funcy, and may never be. You can patch funcy to add them using:

import funcy_pipe
funcy_pipe.patch()

Coming From Ruby?

uniq => distinct
detect => where(some="Condition") | first or where_attr(some="Condition") | first
inverse => complement
times => repeatedly

Module Alias

Create a module alias for funcy-pipe to make things clean (import * always irks me):

# fp.py
from funcy_pipe import *

# code.py
import fp

Inspiration

Elixir's pipe operator. array |> map(fn) |> filter(fn)
Ruby's enumerable library. array.map(&:fn).filter(&:fn)
https://pypi.org/project/funcy-chain
https://github.com/JulienPalard/Pipe

TODO

tests
docs for additional utils
fix typing threading
