# -*- coding: utf-8 -*-
from setuptools import setup

packages = \
['pipda']

package_data = \
{'': ['*']}

install_requires = \
['diot', 'executing', 'pure_eval<1.0.0', 'varname']

setup_kwargs = {
    'name': 'pipda',
    'version': '0.5.5',
    'description': 'A framework for data piping in python',
    'long_description': '# pipda\n\n[![Pypi][7]][8] [![Github][9]][10] [![PythonVers][11]][8] [![Codacy][16]][14] [![Codacy coverage][15]][14] ![Docs building][13] ![Building][12]\n\nA framework for data piping in python\n\nInspired by [siuba][1], [dfply][2], [plydata][3] and [dplython][4], but with simple yet powerful APIs to mimic the `dplyr` and `tidyr` packages in python\n\n\n[API][17] | [Change Log][18] | [Playground][19]\n\n## Installation\n```shell\npip install -U pipda\n```\n\n## Usage\n\nCheckout [datar][6] for more detailed usages.\n\n### Verbs\n\nVerbs are functions next to the piping sign (`>>`) receiving the data directly.\n\n```python\nimport pandas as pd\nfrom pipda import (\n    register_verb,\n    register_func,\n    register_operator,\n    evaluate_expr,\n    Operator,\n    Symbolic,\n    Context\n)\n\nf = Symbolic()\n\ndf = pd.DataFrame({\n    \'x\': [0, 1, 2, 3],\n    \'y\': [\'zero\', \'one\', \'two\', \'three\']\n})\n\ndf\n\n#      x    y\n# 0    0    zero\n# 1    1    one\n# 2    2    two\n# 3    3    three\n\n@register_verb(pd.DataFrame)\ndef head(data, n=5):\n    return data.head(n)\n\ndf >> head(2)\n#      x    y\n# 0    0    zero\n# 1    1    one\n\n@register_verb(pd.DataFrame, context=Context.EVAL)\ndef mutate(data, **kwargs):\n    data = data.copy()\n    for key, val in kwargs.items():\n        data[key] = val\n    return data\n\ndf >> mutate(z=1)\n#    x      y  z\n# 0  0   zero  1\n# 1  1    one  1\n# 2  2    two  1\n# 3  3  three  1\n\ndf >> mutate(z=f.x)\n#    x      y  z\n# 0  0   zero  0\n# 1  1    one  1\n# 2  2    two  2\n# 3  3  three  3\n\n# Verbs that don\'t compile f.a to data, but just the column name\n@register_verb(pd.DataFrame, context=Context.SELECT)\ndef select(data, *columns):\n    return data.loc[:, columns]\n\n# f.x won\'t be compiled as df.x but just \'x\'\ndf >> mutate(z=2*f.x) >> select(f.x, f.z)\n#      x    z\n# 0    0    0\n# 1    1    2\n# 2    2    4\n# 3    3    6\n\n# Compile the args inside the verb\n@register_verb(pd.DataFrame, context=Context.PENDING)\ndef mutate_existing(data, column, value):\n    column = evaluate_expr(column, data, Context.SELECT)\n    value = evaluate_expr(value, data, Context.EVAL)\n    data = data.copy()\n    data[column] = value\n    return data\n\n# First f.x compiled as column name, and second as Series data\ndf2 = df >> mutate_existing(f.x, 10 * f.x)\ndf2\n#      x    y     z\n# 0    0    zero  0\n# 1    10   one   2\n# 2    20   two   4\n# 3    30   three 6\n\n# Evaluate the arguments by yourself\n@register_verb(pd.DataFrame, context=Context.PENDING)\ndef mutate_existing2(data, column, value):\n    column = evaluate_expr(column, data, Context.SELECT)\n    value = evaluate_expr(value, df2, Context.EVAL)\n    data = data.copy()\n    data[column] = value\n    return data\n\ndf >> mutate_existing2(f.x, 2 * f.x)\n#      x    y\n# 0    0    zero\n# 1    20   one\n# 2    40   two\n# 3    60   three\n\n# register for multiple types\n@register_verb(int)\ndef add(data, other):\n    return data + other\n\n# add is actually a singledispatch generic function\n@add.register(float)\ndef _(data, other):\n    return data * other\n\n1 >> add(1)\n# 2\n1.1 >> add(1.0)\n# 1.1\n\n# As it\'s a singledispatch generic function, we can do it for multiple types\n# with the same logic\n@register_verb(context=Context.EVAL)\ndef mul(data, other):\n    raise NotImplementedError # not invalid until types registered\n\n@mul.register(int)\n@mul.register(float)\n# or you could do @mul.register((int, float))\n# context is also supported\ndef _(data, other):\n    return data * other\n\n3 >> mul(2)\n# 6\n3.2 >> mul(2)\n# 6.4\n```\n\n### Functions used in verb arguments\n```python\n@register_func(context=Context.EVAL)\ndef if_else(data, cond, true, false):\n    cond.loc[cond.isin([True]), ] = true\n    cond.loc[cond.isin([False]), ] = false\n    return cond\n\n# The function is then also a singledispatch generic function\n\ndf >> mutate(z=if_else(f.x>1, 20, 10))\n#    x      y   z\n# 0  0   zero  10\n# 1  1    one  10\n# 2  2    two  20\n# 3  3  three  20\n```\n\n```python\n# function without data argument\n@register_func(None)\ndef length(strings):\n    return [len(s) for s in strings]\n\ndf >> mutate(z=length(f.y))\n\n#    x     y    z\n# 0  0  zero    4\n# 1  1   one    3\n# 2  2   two    3\n# 3  3 three    5\n```\n\n```python\n# register existing functions\nfrom numpy import vectorize\nlen = register_func(None, context=Context.EVAL, func=vectorize(len))\n\n# original function still works\nprint(len(\'abc\'))\n\ndf >> mutate(z=len(f.y))\n\n# 3\n#   x     y z\n# 0 0  zero 4\n# 1 1   one 3\n# 2 2   two 3\n# 3 3 three 5\n```\n\n### Operators\nYou may also redefine the behavior of the operators\n```python\n@register_operator\nclass MyOperators(Operator):\n    def xor(self, a, b):\n        """Inteprete X ^ Y as pow(X, Y)."""\n        return a ** b\n\ndf >> mutate(z=f.x ^ 2)\n#      x    y      z\n# 0    0    zero   0\n# 1    1    one    1\n# 2    2    two    4\n# 3    3    three  9\n```\n\n### Context\n\nThe context defines how a reference (`f.A`, `f[\'A\']`, `f.A.B` is evaluated)\n\n```python\nfrom pipda import ContextBase\n\nclass MyContext(ContextBase):\n    name = \'my\'\n    def getattr(self, parent, ref):\n        # double it to distinguish getattr\n        return getattr(parent, ref)\n    def getitem(self, parent, ref):\n        return parent[ref] * 2\n    @property\n    def ref(self):\n        # how we evaluate the ref in f[ref]\n        return self\n\n\n@register_verb(context=MyContext())\ndef mutate_mycontext(data, **kwargs):\n    for key, val in kwargs.items():\n        data[key] = val\n    return data\n\ndf >> mutate_mycontext(z=f.x + f[\'x\'])\n\n#   x     y z\n# 0 0  zero 0\n# 1 1   one 3\n# 2 2   two 6\n# 3 3 three 9\n```\n\n```python\n# when ref in f[ref] is also needed to be evaluated\ndf = df >> mutate(zero=0, one=1, two=2, three=3)\ndf\n\n#    x      y  z  zero  one  two  three\n# 0  0   zero  0     0    1    2      3\n# 1  1    one  3     0    1    2      3\n# 2  2    two  6     0    1    2      3\n# 3  3  three  9     0    1    2      3\n```\n\n```python\ndf >> mutate_mycontext(m=f[f.y][:1].values[0])\n# f.y returns [\'zero\', \'one\', \'two\', \'three\']\n# f[f.y] gets [[0, 2, 4, 6], [0, 2, 4, 6], [0, 2, 4, 6], [0, 2, 4, 6]]\n# f[f.y][:1].values gets [[0, 4, 8, 16]]\n# f[f.y][:1].values[0] returns [0, 8, 16, 32]\n# Notes that each subscription ([]) will double the values\n\n#    x      y  z  zero  one  two  three   m\n# 0  0   zero  0     0    1    2      3   0\n# 1  1    one  3     0    1    2      3   8\n# 2  2    two  6     0    1    2      3  16\n# 3  3  three  9     0    1    2      3  24\n```\n\n### Calling rules\n\n#### Verb calling rules\n\n1. `data >> verb(...)`\\\n    [PIPING_VERB]\\\n    First argument should not be passed, using the data\n2. `data >> other_verb(verb(...))`\\\n   `other_verb(data, verb(...))`\\\n   `registered_func(verb(...))`\\\n    [PIPING]\\\n    Try using the first argument to evaluate (FastEvalVerb), if first argument\n    is data. Otherwise, if it is Expression object, works as a non-data\n    Function.\n3. `verb(...)`\\\n    Called independently. The verb will be called regularly anyway.\n    The first argument will be used as data to evaluate the arguments\n    if there are any Expression objects\n4. `verb(...)` with DataEnv\\\n    First argument should not be passed in, will use the DataEnv\'s data\n    to evaluate the arguments\n\n#### Data function calling rules\n\nFunctions that require first argument as data argument.\n\n1. `data >> verb(func(...))` or `verb(data, func(...))`\\\n    First argument is not used. Will use data\n2. `func(...)`\\\n    Called independently. The function will be called regularly anyway.\n    Similar as Verb calling rule, but first argument will not be used for\n    evaluation\n3. `func(...)` with DataEnv\\\n    First argument not used, passed implicitly with DataEnv.\n\n### Non-data function calling rules:\n\n1. `data >> verb(func(...))` or `verb(data, func(...))`\\\n    Return a Function object waiting for evaluation\n2. `func(...)`\\\n    Called regularly anyway\n3. `func(...) with DataEnv`\\\n    Evaluate with DataEnv. For example: mean(f.x)\n\n### Caveats\n\n- You have to use and_ and or_ for bitwise and/or (`&`/`|`) operators, as and and or are python keywords.\n\n- Limitations:\n\n    Any limitations apply to `executing` to detect the AST node will apply to `pipda`. It may not work in some circumstances where other AST magics apply.\n\n    **What if source code is not available?**\n\n    `executing` does not work in the case where source code is not available, as there is no way to detect the AST node to check how the functions (verbs, data functions, non-data functions) are called, either they are called as a piping verb (`data >> verb(...)`), or they are called as an argument of a verb (`data >> verb(func(...))`) or even they are called independently/regularly.\n\n    In such a case, you can set the option (`options.assume_all_piping=True` (`pipda` `v0.4.4+`)) to assume that all registered functions are called in piping mode, so that you can do `data >> verb(...)` without any changes.\n\n    You can also use this option to enhance the performance by skipping detection of the calling environment.\n\n- Use another piping sign\n\n    ```python\n    from pipda import register_piping\n    register_piping(\'^\')\n\n    # register verbs and functions\n    df ^ verb1(...) ^ verb2(...)\n    ```\n\n    Allowed signs are: `+`, `-`, `*`, `@`, `/`, `//`, `%`, `**`, `<<`, `>>`, `&`, `^` and `|`.\n\n    Note that to use the new  piping sign, you have to register the verbs after the new piping sign being registered.\n\n- The context\n\n    The context is only applied to the `DirectReference` objects or unary operators, like `-f.A`, `+f.A`, `~f.A`, `f.A`, `f[\'A\']`, `[f.A, f.B]`, etc. Any other `Expression` wrapping those objects or other operators getting involved will turn the context to `Context.EVAL`\n\n## How it works\n### The verbs\n```R\ndata %>% verb(arg1, ..., key1=kwarg1, ...)\n```\nThe above is a typical `dplyr`/`tidyr` data piping syntax.\n\nThe counterpart R syntax we expect is:\n```python\ndata >> verb(arg1, ..., key1=kwarg1, ...)\n```\nTo implement that, we need to defer the execution of the `verb` by turning it into a `Verb` object, which holds all information of the function to be executed later. The `Verb` object won\'t be executed until the `data` is piped in. It all thanks to the [`executing`][5] package to let us determine the ast nodes where the function is called. So that we are able to determine whether the function is called in a piping mode.\n\nIf an argument is referring to a column of the data and the column will be involved in the later computation, the it also needs to be deferred. For example, with `dplyr` in `R`:\n```R\ndata %>% mutate(z=a)\n```\nis trying add a column named `z` with the data from column `a`.\n\nIn python, we want to do the same with:\n```python\ndata >> mutate(z=f.a)\n```\nwhere `f.a` is a `Reference` object that carries the column information without fetching the data while python sees it immmediately.\n\nHere the trick is `f`. Like other packages, we introduced the `Symbolic` object, which will connect the parts in the argument and make the whole argument an `Expression` object. This object is holding the execution information, which we could use later when the piping is detected.\n\n### The functions\nThen what if we want to use some functions in the arguments of the `verb`?\nFor example:\n```python\ndata >> select(starts_with(\'a\'))\n```\nto select the columns with names start with `\'a\'`.\n\nNo doubt that we need to defer the execution of the function, too. The trick is that we let the function return a `Function` object as well, and evaluate it as the argument of the verb.\n\n### The operators\n`pipda` also opens oppotunities to change the behavior of the operators in verb/function arguments. This allows us to mimic something like this:\n```python\ndata >> select(-f.a) # select all columns but `a`\n```\n\nTo do that, we turn it into an `Operator` object. Just like a `Verb` or a `Function` object, the execution is deferred. By default, the operators we used are from the python standard library `operator`. `operator.neg` in the above example.\n\nYou can also define you own by subclassing the `Operator` class, and then register it to replace the default one by decorating it with `register_operator`.\n\n\n[1]: https://github.com/machow/siuba\n[2]: https://github.com/kieferk/dfply\n[3]: https://github.com/has2k1/plydata\n[4]: https://github.com/dodger487/dplython\n[5]: https://github.com/alexmojaki/executing\n[6]: https://github.com/pwwang/datar\n[7]: https://img.shields.io/pypi/v/pipda?style=flat-square\n[8]: https://pypi.org/project/pipda/\n[9]: https://img.shields.io/github/v/tag/pwwang/pipda?style=flat-square\n[10]: https://github.com/pwwang/pipda\n[11]: https://img.shields.io/pypi/pyversions/pipda?style=flat-square\n[12]: https://img.shields.io/github/workflow/status/pwwang/pipda/Build%20and%20Deploy?style=flat-square\n[13]: https://img.shields.io/github/workflow/status/pwwang/pipda/Build%20Docs?style=flat-square\n[14]: https://app.codacy.com/gh/pwwang/pipda/dashboard\n[15]: https://img.shields.io/codacy/coverage/75d312da24c94bdda5923627fc311a99?style=flat-square\n[16]: https://img.shields.io/codacy/grade/75d312da24c94bdda5923627fc311a99?style=flat-square\n[17]: https://pwwang.github.io/pipda/api/pipda/\n[18]: https://pwwang.github.io/pipda/changelog/\n[19]: https://mybinder.org/v2/gh/pwwang/pipda/master?filepath=README.ipynb\n[20]: https://pwwang.github.io/datar/piping_vs_regular/\n',
    'author': 'pwwang',
    'author_email': 'pwwang@pwwang.com',
    'maintainer': None,
    'maintainer_email': None,
    'url': None,
    'packages': packages,
    'package_data': package_data,
    'install_requires': install_requires,
    'python_requires': '>=3.7,<4.0',
}


setup(**setup_kwargs)
