entie package
entie
Public API for entie: MongoDB helpers built on entei-core.
entei-core provides collection-to-columnar materialization; entie adds PyMongo connection helpers, bulk row inserts, a small lazy DataFrame-style API, and expression helpers. Pure Python on top of PyMongo (no native stack).
EnteiDataFrame
Lazy view over a collection; filters and projection run in Python when `collect` is called.
Reads the collection once via `entei_core.mongo_root_to_column_dict` (a full `find()`), then applies the `filter_rows` predicates and the `select` column order. This is not a streaming or server-side aggregation API.
Source code in packages/entie/src/entie/dataframe.py
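The scan, filter, project order described above can be sketched in plain Python. This is a minimal illustration and not the library's code: `FakeCollection` and `collect_sketch` are hypothetical names, and filling missing fields with `None` during projection is an assumption.

```python
from typing import Any, Callable, Iterable, Optional, Sequence

Row = dict[str, Any]


class FakeCollection:
    """Hypothetical stand-in that exposes only find(), like a PyMongo collection."""

    def __init__(self, docs: Iterable[Row]) -> None:
        self._docs = [dict(doc) for doc in docs]

    def find(self) -> list[Row]:
        return [dict(doc) for doc in self._docs]


def collect_sketch(
    collection: FakeCollection,
    filters: Sequence[Callable[[Row], bool]] = (),
    projection: Optional[Sequence[str]] = None,
) -> list[Row]:
    rows = list(collection.find())  # one full scan, held in memory
    for predicate in filters:  # predicates run in order, client-side
        rows = [row for row in rows if predicate(row)]
    if projection is not None:  # projection happens after filtering
        # Assumption: a field missing from a row becomes None.
        rows = [{name: row.get(name) for name in projection} for row in rows]
    return rows


people = FakeCollection(
    [
        {"name": "Ada", "age": 36},
        {"name": "Linus", "age": 17},
        {"name": "Grace", "age": 45},
    ]
)
adults = collect_sketch(
    people,
    filters=(lambda row: row["age"] >= 18,),
    projection=("name",),
)
```

Because every document is fetched before any predicate runs, this pattern suits small collections; large ones would likely need server-side queries instead.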
__init__(collection, *, fields=None, filters=(), projection=None)
Use `from_collection` to construct a frame; the constructor is used for chaining.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `collection` | `Any` | PyMongo collection (or compatible) scanned on | required |
| `fields` | `tuple[str, ...] \| None` | Column names passed to :class: | `None` |
| `filters` | `tuple[Callable[[dict[str, Any]], bool], ...]` | Predicates applied in order to row dicts after materialization. | `()` |
| `projection` | `tuple[str, ...] \| None` | If set, output columns are restricted to these names (after filters). | `None` |

Source code in packages/entie/src/entie/dataframe.py
collect(*, as_lists=True)
collect(*, as_lists: Literal[True] = True) -> dict[str, list[Any]]
collect(*, as_lists: Literal[False]) -> list[dict[str, Any]]
Materialize: scan collection, apply filters, then optional projection.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `as_lists` | `bool` | If | `True` |

Returns:
| Type | Description |
|---|---|
| `dict[str, list]` or `list[dict]` | Columnar or row-oriented result consistent with |

Source code in packages/entie/src/entie/dataframe.py
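The two return shapes in the overloads above (a dict of lists, or a list of dicts) carry the same data. A small sketch of converting between them follows; `columns_to_rows` and `rows_to_columns` are hypothetical helper names, and the code assumes all columns have equal length.

```python
from typing import Any, Sequence


def columns_to_rows(columns: dict[str, list[Any]]) -> list[dict[str, Any]]:
    """Turn {"a": [1, 2], "b": [3, 4]} into [{"a": 1, "b": 3}, {"a": 2, "b": 4}]."""
    names = list(columns)
    # zip(*values) walks the aligned columns one row at a time.
    return [dict(zip(names, values)) for values in zip(*columns.values())]


def rows_to_columns(
    rows: Sequence[dict[str, Any]], names: Sequence[str]
) -> dict[str, list[Any]]:
    """Inverse direction; the column order follows `names`."""
    return {name: [row.get(name) for row in rows] for name in names}


columnar = {"name": ["Ada", "Grace"], "age": [36, 45]}
row_oriented = columns_to_rows(columnar)
round_trip = rows_to_columns(row_oriented, ["name", "age"])
```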
filter_rows(predicate)
Return a frame that keeps rows where predicate(row) is true.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `predicate` | `Callable[[dict[str, Any]], bool]` | Called with each row as a | required |

Returns:
| Type | Description |
|---|---|
| `EnteiDataFrame` | New frame with |

Source code in packages/entie/src/entie/dataframe.py
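Since `filter_rows` returns a new frame, chained calls leave earlier frames untouched. One way to model that behaviour is a frozen dataclass whose predicate tuple grows with each call. `FrameSketch` and `keeps` are hypothetical names, and this is only an illustration of the pattern, not the actual class.

```python
from dataclasses import dataclass, replace
from typing import Any, Callable

Predicate = Callable[[dict[str, Any]], bool]


@dataclass(frozen=True)
class FrameSketch:
    """Hypothetical immutable frame that only tracks its predicates."""

    filters: tuple[Predicate, ...] = ()

    def filter_rows(self, predicate: Predicate) -> "FrameSketch":
        # Return a copy with the predicate appended; self is not modified.
        return replace(self, filters=self.filters + (predicate,))

    def keeps(self, row: dict[str, Any]) -> bool:
        # A row survives only if every predicate accepts it, checked in order.
        return all(predicate(row) for predicate in self.filters)


base = FrameSketch()
adults = base.filter_rows(lambda row: row["age"] >= 18)
named_adults = adults.filter_rows(lambda row: bool(row.get("name")))
```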
from_collection(collection, *, fields=None)
classmethod
Build a frame from a PyMongo (or mongomock) collection.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `collection` | `Any` | Collection whose documents are read when :meth: | required |
| `fields` | `Sequence[str] \| None` | Ordered top-level field names. If | `None` |

Returns:
| Type | Description |
|---|---|
| `EnteiDataFrame` | Lazy frame; call :meth: |

See Also
entei_core.mongo_root.MongoRoot : Semantics of fields and empty collections.
Source code in packages/entie/src/entie/dataframe.py
select(*columns)
Keep only the given output columns (applied after filter_rows).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `*columns` | `str` | One or more distinct field names. | `()` |

Returns:
| Type | Description |
|---|---|
| `EnteiDataFrame` | New frame with projection set. |

Raises:
| Type | Description |
|---|---|
| `ValueError` | If no columns, or if any name is duplicated. |

Source code in packages/entie/src/entie/dataframe.py
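The `ValueError` conditions listed for `select` (no columns, or a duplicated name) can be checked up front. The sketch below uses hypothetical helpers `validate_columns` and `project`; the error messages and the `None` fill for missing fields are assumptions.

```python
from typing import Any, Sequence


def validate_columns(columns: Sequence[str]) -> tuple[str, ...]:
    """Reject an empty selection and repeated names, as select() is documented to."""
    if not columns:
        raise ValueError("select() needs at least one column")
    seen: set[str] = set()
    for name in columns:
        if name in seen:
            raise ValueError(f"duplicate column: {name!r}")
        seen.add(name)
    return tuple(columns)


def project(rows: Sequence[dict[str, Any]], *columns: str) -> list[dict[str, Any]]:
    names = validate_columns(columns)
    # Output keys follow the order given to select, not the document order.
    return [{name: row.get(name) for name in names} for row in rows]


rows = [{"name": "Ada", "age": 36, "city": "London"}]
projected = project(rows, "city", "name")
```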
EntieDatabase
Handle for a single MongoDB database (PyMongo Database).
Must wrap a PyMongo `pymongo.database.Database`, not a `pymongo.collection.Collection`. Use `collection` or `table` to obtain collections by name.
Source code in packages/entie/src/entie/client.py
raw
property
The underlying PyMongo `pymongo.database.Database`.
__init__(db)
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `db` | `Any` | PyMongo :class: | required |

Source code in packages/entie/src/entie/client.py
collection(name)
Return the named `pymongo.collection.Collection`.
Source code in packages/entie/src/entie/client.py
list_collection_names()
List collection names in this database (see PyMongo list_collection_names).
Source code in packages/entie/src/entie/client.py
table(name)
Alias for `collection` (collection-as-table naming).
Source code in packages/entie/src/entie/client.py
tables()
Same as `list_collection_names`.
Source code in packages/entie/src/entie/client.py
EntieMongoClient
Thin wrapper around `pymongo.mongo_client.MongoClient`.
Use `database` to get an `EntieDatabase`, then `EntieDatabase.collection` or `EntieDatabase.table` to obtain a collection for `EnteiDataFrame.from_collection` and inserts.
Source code in packages/entie/src/entie/client.py
raw
property
The underlying `pymongo.mongo_client.MongoClient`.
__enter__()
Enter context: returns self (caller should close() on exit).
Source code in packages/entie/src/entie/client.py
__exit__(*_args)
Exit context: calls `close`.
Source code in packages/entie/src/entie/client.py
__init__(client)
Wrap an existing PyMongo client.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `client` | `MongoClient[Any]` | Connected :class: | required |

Source code in packages/entie/src/entie/client.py
close()
Close the underlying PyMongo client (releases sockets).
Source code in packages/entie/src/entie/client.py
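The `__enter__`, `__exit__`, and `close` entries above describe a standard context-manager wrapper. A minimal sketch of that shape follows, using a hypothetical `FakeMongoClient` so it runs without a server; `ClientWrapperSketch` is likewise an illustrative name and not the library's class.

```python
from typing import Any


class FakeMongoClient:
    """Hypothetical stand-in that only records whether close() was called."""

    def __init__(self) -> None:
        self.closed = False

    def close(self) -> None:
        self.closed = True


class ClientWrapperSketch:
    def __init__(self, client: Any) -> None:
        self._client = client

    @property
    def raw(self) -> Any:
        # Expose the wrapped client for calls the wrapper does not cover.
        return self._client

    def close(self) -> None:
        self._client.close()

    def __enter__(self) -> "ClientWrapperSketch":
        return self

    def __exit__(self, *_args: Any) -> None:
        # Runs on normal exit and on exceptions, so sockets are always released.
        self.close()


fake = FakeMongoClient()
with ClientWrapperSketch(fake) as wrapped:
    closed_inside_block = fake.closed
closed_after_block = fake.closed
```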
database(name, *, codec_options=None)
Return a database by name.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `name` | `str` | MongoDB database name. | required |
| `codec_options` | `Any \| None` | Optional BSON :class: | `None` |

Returns:
| Type | Description |
|---|---|
| `EntieDatabase` | Wrapper around |

Source code in packages/entie/src/entie/client.py
MongoRoot
dataclass
Carrier for a collection plus optional fixed column list for materialization.
Used with `entei_core.mongo_root_to_column_dict` to produce a `dict[str, list]` with one list per top-level field.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `collection` | `Any` | A PyMongo :class: | required |
| `fields` | `tuple[str, ...] \| None` | Column order and membership for output. If | `None` |

Raises:
| Type | Description |
|---|---|
| `ValueError` | If |

Source code in packages/entei-core/src/entei_core/mongo_root.py
__post_init__()
Validate fields invariants.
Source code in packages/entei-core/src/entei_core/mongo_root.py
Records
Rows staged for insertion into a MongoDB collection via PyMongo insert_many.
Source code in packages/entie/src/entie/io/records.py
__init__(rows, *, database)
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `rows` | `list[dict[str, Any]]` | BSON-compatible document dicts to insert. | required |
| `database` | `EntieDatabase` | Target database; collection is chosen in :meth: | required |

Source code in packages/entie/src/entie/io/records.py
from_list(rows, *, database)
classmethod
Copy rows into a new list and wrap with database.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `rows` | `list[dict[str, Any]]` | Documents to insert (shallow-copied list; dicts are not deep-copied). | required |
| `database` | `EntieDatabase` | :class: | required |

Returns:
| Type | Description |
|---|---|
| `Records` | Call :meth: |

Source code in packages/entie/src/entie/io/records.py
insert_into(table)
Insert all rows into the named collection.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `table` | `str` | Collection name on | required |

Returns:
| Type | Description |
|---|---|
| `InsertManyResult` or `None` | PyMongo |

Source code in packages/entie/src/entie/io/records.py
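The stage-then-insert flow of `Records` can be sketched with in-memory stand-ins. Everything named `Fake...` or `...Sketch` here is hypothetical. Returning `None` for an empty row list is an assumption about when the documented `None` result occurs, and the fake `insert_many` returns a count where real PyMongo returns an `InsertManyResult`.

```python
from typing import Any, Optional


class FakeCollection:
    """Hypothetical collection that stores documents in a list."""

    def __init__(self) -> None:
        self.docs: list[dict[str, Any]] = []

    def insert_many(self, documents: list[dict[str, Any]]) -> int:
        self.docs.extend(documents)
        return len(documents)  # real PyMongo returns an InsertManyResult


class FakeDatabase:
    """Hypothetical database handle offering collection(name)."""

    def __init__(self) -> None:
        self._collections: dict[str, FakeCollection] = {}

    def collection(self, name: str) -> FakeCollection:
        return self._collections.setdefault(name, FakeCollection())


class RecordsSketch:
    def __init__(self, rows: list[dict[str, Any]], *, database: FakeDatabase) -> None:
        self._rows = rows
        self._database = database

    @classmethod
    def from_list(cls, rows: list[dict[str, Any]], *, database: FakeDatabase) -> "RecordsSketch":
        # Shallow copy: later appends to the caller's list are not picked up.
        return cls(list(rows), database=database)

    def insert_into(self, table: str) -> Optional[int]:
        if not self._rows:
            return None  # assumption: nothing to insert yields None
        return self._database.collection(table).insert_many(self._rows)


db = FakeDatabase()
source = [{"name": "Ada"}]
records = RecordsSketch.from_list(source, database=db)
source.append({"name": "added later"})
inserted = records.insert_into("people")
empty_result = RecordsSketch.from_list([], database=db).insert_into("people")
```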
col(name)
Return name unchanged (readable select(col("x"))-style spelling).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `name` | `str` | Column / field name. | required |

Returns:
| Type | Description |
|---|---|
| `str` | The same string |

Source code in packages/entie/src/entie/expressions/__init__.py
column(name)
Alias of `col`.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `name` | `str` | Column / field name. | required |

Returns:
| Type | Description |
|---|---|
| `str` | The same string |

Source code in packages/entie/src/entie/expressions/__init__.py
connect(uri=None, *, database=None, client=None, **client_kwargs)
connect(uri: str | None = None, *, database: str, client: MongoClient[Any] | None = None, **client_kwargs: Any) -> EntieDatabase
connect(uri: str | None = None, *, database: None = None, client: MongoClient[Any] | None = None, **client_kwargs: Any) -> EntieMongoClient
Open a MongoDB connection or wrap an existing client.
If `client` is omitted, builds a new `pymongo.mongo_client.MongoClient` from `uri`, or from the `ENTIE_URI` environment variable when `uri` is omitted.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `uri` | `str \| None` | MongoDB connection URI. Ignored when | `None` |
| `database` | `str \| None` | If set, returns :class: | `None` |
| `client` | `MongoClient[Any] \| None` | Existing client to wrap. Do not pass | `None` |
| `**client_kwargs` | `Any` | Forwarded to :class: | `{}` |

Returns:
| Type | Description |
|---|---|
| `EntieMongoClient` or `EntieDatabase` | Client wrapper, or database handle when |

Raises:
| Type | Description |
|---|---|
| `ValueError` | If neither a resolvable URI nor |

Source code in packages/entie/src/entie/client.py
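The URI resolution order described for `connect` (explicit `uri`, then the `ENTIE_URI` environment variable, with an existing `client` bypassing both) can be sketched as below. `resolve_uri` is a hypothetical helper, not part of the package. Treating an existing client as taking precedence, and raising `ValueError` only when neither a URI nor a client is available, are assumptions drawn from the partial descriptions above.

```python
import os
from typing import Any, Mapping, Optional


def resolve_uri(
    uri: Optional[str] = None,
    *,
    client: Optional[Any] = None,
    environ: Optional[Mapping[str, str]] = None,
) -> Optional[str]:
    """Return the URI a new client would use, or None when a client is supplied."""
    env = os.environ if environ is None else environ
    if client is not None:
        return None  # assumption: an existing client is wrapped as-is
    resolved = uri or env.get("ENTIE_URI")
    if not resolved:
        raise ValueError("pass uri=, set ENTIE_URI, or pass client=")
    return resolved


explicit = resolve_uri("mongodb://localhost:27017", environ={})
from_env = resolve_uri(environ={"ENTIE_URI": "mongodb://db.internal:27017"})
wrapped = resolve_uri(client=object(), environ={})
```

Passing `environ` explicitly keeps the helper easy to test without touching the real process environment.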
lit(value)
Return value unchanged (placeholder for literal-friendly APIs).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `value` | `Any` | Any object. | required |

Returns:
| Type | Description |
|---|---|
| `Any` | The input |

Source code in packages/entie/src/entie/expressions/__init__.py
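As documented, `col`, `column`, and `lit` return their argument unchanged, so they exist for readability only. A sketch of equivalent identity helpers is shown here; these are local re-implementations for illustration, not imports from the package.

```python
from typing import TypeVar

T = TypeVar("T")


def col(name: str) -> str:
    """Identity on a column name; lets call sites read select(col("x"))."""
    return name


def column(name: str) -> str:
    """Alias of col."""
    return col(name)


def lit(value: T) -> T:
    """Identity on any value; marks it as a literal at the call site."""
    return value


selected = (col("name"), column("age"))
payload = [1, 2, 3]
same_object = lit(payload)
```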
materialize_root_data(data)
Normalize pipeline data: columnarize a `MongoRoot`, otherwise return the input unchanged.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `data` | `Any` | Any value. If it is a :class: | required |

Returns:
| Type | Description |
|---|---|
| `Any` | Columnar dict or the original |

Source code in packages/entei-core/src/entei_core/_materialize.py
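The "columnarize or pass through" behaviour amounts to a type dispatch. The sketch below uses a hypothetical `RootSketch` carrier holding documents directly, rather than a real collection, to keep the example self-contained; `materialize_sketch` is an illustrative name.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class RootSketch:
    """Hypothetical carrier; the real MongoRoot wraps a collection instead."""

    docs: tuple[dict[str, Any], ...]
    fields: tuple[str, ...]


def materialize_sketch(data: Any) -> Any:
    if isinstance(data, RootSketch):
        # Carrier objects are expanded into one list per field.
        return {name: [doc.get(name) for doc in data.docs] for name in data.fields}
    return data  # anything else passes through untouched


root = RootSketch(docs=({"a": 1}, {"a": 2}), fields=("a",))
columnar = materialize_sketch(root)
already_plain = {"a": [9]}
passthrough = materialize_sketch(already_plain)
```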
mongo_root_to_column_dict(root)
Run find() on root.collection and build aligned column lists.
Reads the entire cursor into memory. Only top-level keys participate; nested documents are values in a single cell.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `root` | `MongoRoot` | Collection and optional | required |

Returns:
| Type | Description |
|---|---|
| `dict[str, list]` | Keys are field names; each value is the column in document order. |

Notes
When root.fields is None, keys are inferred from documents. When it is
an empty tuple, returns {} for any document count.
Source code in packages/entei-core/src/entei_core/_materialize.py
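The notes above (keys inferred when `fields` is `None`, an empty dict when `fields` is an empty tuple, nested documents kept as single cells) can be illustrated with a pure-Python sketch. `FakeCollection`, `MongoRootSketch`, and `to_column_dict` are hypothetical names. Inferring keys in first-seen order and filling absent keys with `None` are assumptions, since the source does not state either.

```python
from dataclasses import dataclass
from typing import Any, Optional


class FakeCollection:
    """Hypothetical stand-in that exposes only find()."""

    def __init__(self, docs: list[dict[str, Any]]) -> None:
        self._docs = docs

    def find(self) -> list[dict[str, Any]]:
        return list(self._docs)


@dataclass(frozen=True)
class MongoRootSketch:
    collection: Any
    fields: Optional[tuple[str, ...]] = None


def to_column_dict(root: MongoRootSketch) -> dict[str, list[Any]]:
    docs = list(root.collection.find())  # whole cursor read into memory
    if root.fields is None:
        names: list[str] = []
        for doc in docs:  # assumption: keys inferred in first-seen order
            for key in doc:
                if key not in names:
                    names.append(key)
    else:
        names = list(root.fields)  # an empty tuple yields no columns
    # Assumption: a key absent from a document becomes None so columns stay aligned.
    return {name: [doc.get(name) for doc in docs] for name in names}


coll = FakeCollection(
    [
        {"_id": 1, "name": "Ada", "address": {"city": "London"}},
        {"_id": 2, "name": "Grace", "age": 45},
    ]
)
inferred = to_column_dict(MongoRootSketch(coll))
fixed = to_column_dict(MongoRootSketch(coll, fields=("name",)))
empty = to_column_dict(MongoRootSketch(coll, fields=()))
```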