Core glom API¶
glom gets results.
The glom package has one central entrypoint, glom.glom(). Everything else in the package revolves around that one function. Sometimes, big things come in small packages.
A couple of conventional terms you’ll see repeated many times below:
target - glom is built to work on any data, so we simply refer to the object being accessed as the “target”
spec - (aka “glomspec”, short for specification) The accompanying template used to specify the structure of the return value.
Now that you know the terms, let’s take a look around glom’s powerful semantics.
See also
As the glom API grows, we’ve refactored the docs into separate domains. The core API is below. More specialized types can also be found in the following docs:
Longtime glom docs readers: thanks in advance for reporting/fixing any broken links you may find.
The glom Function¶
Where it all happens. The reason for the season. The eponymous function, glom().
- glom.glom(target, spec, **kwargs)[source]¶
Access or construct a value from a given target based on the specification declared by spec.
Accessing nested data, aka deep-get:
>>> target = {'a': {'b': 'c'}}
>>> glom(target, 'a.b')
'c'
Here the spec was just a string denoting a path, 'a.b'. As simple as it should be. You can also use glob-like wildcard selectors:
>>> target = {'a': [{'k': 'v1'}, {'k': 'v2'}]}
>>> glom(target, 'a.*.k')
['v1', 'v2']
In addition to *, you can also use ** for recursive access:
>>> target = {'a': [{'k': 'v3'}, {'k': 'v4'}], 'k': 'v0'}
>>> glom(target, '**.k')
['v0', 'v3', 'v4']
The next example shows how to use nested data to access many fields at once, and make a new nested structure.
Constructing, or restructuring more-complicated nested data:
>>> target = {'a': {'b': 'c', 'd': 'e'}, 'f': 'g', 'h': [0, 1, 2]}
>>> spec = {'a': 'a.b', 'd': 'a.d', 'h': ('h', [lambda x: x * 2])}
>>> output = glom(target, spec)
>>> pprint(output)
{'a': 'c', 'd': 'e', 'h': [0, 2, 4]}
glom also takes a keyword argument, default. When set, if a glom operation fails with a GlomError, the default will be returned, very much like dict.get():
>>> glom(target, 'a.xx', default='nada')
'nada'
The skip_exc keyword argument controls which errors should be ignored.
>>> glom({}, lambda x: 100.0 / len(x), default=0.0, skip_exc=ZeroDivisionError)
0.0
- Parameters
target (object) – the object on which the glom will operate.
spec (object) – Specification of the output object in the form of a dict, list, tuple, string, other glom construct, or any composition of these.
default (object) – An optional default to return in the case an exception, specified by skip_exc, is raised.
skip_exc (Exception) – An optional exception or tuple of exceptions to ignore and return default (None if omitted). If neither skip_exc nor default is set, glom lets errors propagate.
scope (dict) – Additional data that can be accessed via S inside the glom-spec. Read more: The glom Scope.
It’s a small API with big functionality, and glom’s power is only surpassed by its intuitiveness. Give it a whirl!
Basic Specifiers¶
Basic glom specifications consist of dict, list, tuple, str, and callable objects. However, as data calls for more complicated interactions, glom provides specialized specifier types that can be used with the basic set of Python builtins.
- class glom.Path(*path_parts)[source]¶
Path objects specify explicit paths when the default 'a.b.c'-style general access syntax won’t work or isn’t desirable. Use this to wrap ints, datetimes, and other valid keys, as well as strings with dots that shouldn’t be expanded.
>>> target = {'a': {'b': 'c', 'd.e': 'f', 2: 3}}
>>> glom(target, Path('a', 2))
3
>>> glom(target, Path('a', 'd.e'))
'f'
Paths can be used to join together other Path objects, as well as T objects:
>>> Path(T['a'], T['b'])
T['a']['b']
>>> Path(Path('a', 'b'), Path('c', 'd'))
Path('a', 'b', 'c', 'd')
Paths also support indexing and slicing, with each access returning a new Path object:
>>> path = Path('a', 'b', 1, 2)
>>> path[0]
Path('a')
>>> path[-2:]
Path(1, 2)
To build a Path object from a string, use Path.from_text(). This is the default behavior when the top-level glom() function gets a string spec.
- class glom.Val(value)[source]¶
Val objects are specs which evaluate to the wrapped value.
>>> target = {'a': {'b': 'c'}}
>>> spec = {'a': 'a.b', 'readability': Val('counts')}
>>> pprint(glom(target, spec))
{'a': 'c', 'readability': 'counts'}
Instead of accessing 'counts' as a key like it did with 'a.b', glom() just unwrapped the Val and included the value.
Val takes one argument, the value to be returned.
Note
Val was named Literal in versions of glom before 20.7.0. An alias has been preserved for backwards compatibility, but reprs have changed.
- class glom.Spec(spec, scope=None)[source]¶
Spec objects serve three purposes, roughly ordered by utility:
As a form of compiled or “curried” glom call, similar to Python’s built-in re.compile().
As a marker that an object represents a spec rather than a literal value, in cases where that might be ambiguous.
As a way to update the scope within another Spec.
In the second usage, Spec objects are the complement to Val, wrapping a value and marking that it should be interpreted as a glom spec, rather than a literal value. This is useful in places where it would be interpreted as a value by default, such as T[key] or Call(func), where key and func are assumed to be literal values and not specs.
- Parameters
spec – The glom spec.
scope (dict) – additional values to add to the scope when evaluating this Spec
See also
Note that many of the Specifier types previously mentioned here have moved into their own docs, among them:
Object-Oriented Access and Method Calls with T¶
glom’s shortest-named feature may be its most powerful.
- glom.T = T¶
T, short for “target”. A singleton object that enables object-oriented expression of a glom specification.
Note
T is a singleton, and does not need to be constructed.
Basically, think of T as your data’s stunt double. Everything that you do to T will be recorded and executed during the glom() call. Take this example:
>>> spec = T['a']['b']['c']
>>> target = {'a': {'b': {'c': 'd'}}}
>>> glom(target, spec)
'd'
So far, we’ve relied on the 'a.b.c'-style shorthand for access, or used the Path objects, but if you want to explicitly do attribute and key lookups, look no further than T.
But T doesn’t stop with unambiguous access. You can also call methods and perform almost any action you would with a normal object:
>>> spec = ('a', (T['b'].items(), list))  # reviewed below
>>> glom(target, spec)
[('c', 'd')]
A T object can go anywhere in the spec. As seen in the example above, we access 'a', use a T to get 'b' and iterate over its items, turning them into a list.
You can even use T with Call to construct objects:
>>> class ExampleClass(object):
...     def __init__(self, attr):
...         self.attr = attr
...
>>> target = {'attr': 3.14}
>>> glom(target, Call(ExampleClass, kwargs=T)).attr
3.14
On a further note, while lambda works great in glom specs, and can be very handy at times, T and Call eliminate the need for the vast majority of lambda usage with glom.
Unlike lambda and other functions, T roundtrips beautifully and transparently:
>>> T['a'].b['c']('success')
T['a'].b['c']('success')
T-related access errors raise a PathAccessError during the glom() call.
Note
While T is clearly useful, powerful, and here to stay, its semantics are still being refined. Currently, operations beyond method calls and attribute/item access are considered experimental and should not be relied upon.
Note
T attributes starting with __ are reserved to avoid colliding with many built-in Python behaviors, current and future. The T.__() method is available for cases where they are needed. For example, T.__('class__') is equivalent to accessing the __class__ attribute.
Defaults with Coalesce¶
Data isn’t always where or what you want it to be. Use these specifiers to declare away overly branchy procedural code.
- class glom.Coalesce(*subspecs, **kwargs)[source]¶
Coalesce objects specify fallback behavior for a list of subspecs.
Subspecs are passed as positional arguments, and keyword arguments control defaults. Each subspec is evaluated in turn, and if none match, a CoalesceError is raised, or a default is returned, depending on the options used.
Note
This operation may seem very familiar if you have experience with SQL or even C# and others.
In practice, this fallback behavior’s simplicity is only surpassed by its utility:
>>> target = {'c': 'd'}
>>> glom(target, Coalesce('a', 'b', 'c'))
'd'
glom tries to get 'a' from target, but gets a KeyError. Rather than raise a PathAccessError as usual, glom coalesces into the next subspec, 'b'. The process repeats until it gets to 'c', which returns our value, 'd'. If our value weren’t present, we’d see:
>>> target = {}
>>> glom(target, Coalesce('a', 'b'))
Traceback (most recent call last):
...
CoalesceError: no valid values found. Tried ('a', 'b') and got (PathAccessError, PathAccessError) ...
Same process, but because target is empty, we get a CoalesceError.
Note
Coalesce is a branching specifier type, so as of v20.7.0, its exception messages feature an error tree. See Reading Branched Exceptions for details on how to interpret these exceptions.
If we want to avoid an exception, and we know which value we want by default, we can set default:
>>> target = {}
>>> glom(target, Coalesce('a', 'b', 'c'), default='d-fault')
'd-fault'
'a', 'b', and 'c' weren’t present so we got 'd-fault'.
- Parameters
subspecs – One or more glommable subspecs
default – A value to return if no subspec results in a valid value
default_factory – A callable whose result will be returned as a default
skip – A value, tuple of values, or predicate function representing values to ignore
skip_exc – An exception or tuple of exception types to catch and move on to the next subspec. Defaults to GlomError, the parent type of all glom runtime exceptions.
If all subspecs produce skipped values or exceptions, a CoalesceError will be raised. For more examples, check out the glom Tutorial, which makes extensive use of Coalesce.
- glom.SKIP = Sentinel('SKIP')¶
The SKIP singleton can be returned from a function or included via a Val to cancel assignment into the output object.
>>> target = {'a': 'b'}
>>> spec = {'a': lambda t: t['a'] if t['a'] == 'a' else SKIP}
>>> glom(target, spec)
{}
>>> target = {'a': 'a'}
>>> glom(target, spec)
{'a': 'a'}
Mostly used to drop keys from dicts (as above) or filter objects from lists.
Note
SKIP was known as OMIT in versions 18.3.1 and prior. Versions 19+ will remove the OMIT alias entirely.
- glom.STOP = Sentinel('STOP')¶
The STOP singleton can be used to halt iteration of a list or execution of a tuple of subspecs.
>>> target = range(10)
>>> spec = [lambda x: x if x < 5 else STOP]
>>> glom(target, spec)
[0, 1, 2, 3, 4]
Calling Callables with Invoke¶
New in version 19.10.0.
From calling functions to constructing objects, it’s hardly Python if you’re not invoking callables. By default, single-argument functions work great on their own in glom specs. The function gets passed the target and it just works:
>>> glom(['1', '3', '5'], [int])
[1, 3, 5]
Zero-argument and multi-argument functions get a lot trickier, especially when more than one of those arguments comes from the target, thus the Invoke spec.
- class glom.Invoke(func)[source]¶
Specifier type designed for easy invocation of callables from glom.
- Parameters
func (callable) – A function or other callable object.
Invoke is similar to functools.partial(), but with the ability to set up a “templated” call which interleaves constants and glom specs.
For example, the following creates a spec which can be used to check if targets are integers:
>>> is_int = Invoke(isinstance).specs(T).constants(int)
>>> glom(5, is_int)
True
And this composes like any other glom spec:
>>> target = [7, object(), 9]
>>> glom(target, [is_int])
[True, False, True]
Another example, mixing positional and keyword arguments:
>>> spec = Invoke(sorted).specs(T).constants(key=int, reverse=True)
>>> target = ['10', '5', '20', '1']
>>> glom(target, spec)
['20', '10', '5', '1']
Invoke also helps with evaluating zero-argument functions:
>>> glom(target={}, spec=Invoke(int))
0
(A trivial example, but from timestamps to UUIDs, zero-arg calls do come up!)
Note
Invoke is mostly for functions, object construction, and callable objects. For calling methods, consider the T object.
- constants(*a, **kw)[source]¶
Returns a new Invoke spec, with the provided positional and keyword argument values stored for passing to the underlying function.
>>> spec = Invoke(T).constants(5)
>>> glom(range, (spec, list))
[0, 1, 2, 3, 4]
Subsequent positional arguments are appended:
>>> spec = Invoke(T).constants(2).constants(10, 2)
>>> glom(range, (spec, list))
[2, 4, 6, 8]
Keyword arguments also work as one might expect:
>>> round_2 = Invoke(round).constants(ndigits=2).specs(T)
>>> glom(3.14159, round_2)
3.14
constants() and other Invoke methods may be called multiple times, just remember that every call returns a new spec.
- classmethod specfunc(spec)[source]¶
Creates an Invoke instance where the function is indicated by a spec.
>>> spec = Invoke.specfunc('func').constants(5)
>>> glom({'func': range}, (spec, list))
[0, 1, 2, 3, 4]
- specs(*a, **kw)[source]¶
Returns a new Invoke spec, with the provided positional and keyword arguments stored to be interpreted as specs, with the results passed to the underlying function.
>>> spec = Invoke(range).specs('value')
>>> glom({'value': 5}, (spec, list))
[0, 1, 2, 3, 4]
Subsequent positional arguments are appended:
>>> spec = Invoke(range).specs('start').specs('end', 'step')
>>> target = {'start': 2, 'end': 10, 'step': 2}
>>> glom(target, (spec, list))
[2, 4, 6, 8]
Keyword arguments also work as one might expect:
>>> multiply = lambda x, y: x * y
>>> times_3 = Invoke(multiply).constants(y=3).specs(x='value')
>>> glom({'value': 5}, times_3)
15
specs() and other Invoke methods may be called multiple times, just remember that every call returns a new spec.
- star(args=None, kwargs=None)[source]¶
Returns a new Invoke spec, with args and/or kwargs specs set to be “starred” or “star-starred” (respectively)
>>> spec = Invoke(zip).star(args='lists')
>>> target = {'lists': [[1, 2], [3, 4], [5, 6]]}
>>> list(glom(target, spec))
[(1, 3, 5), (2, 4, 6)]
- Parameters
args (spec) – A spec to be evaluated and “starred” into the underlying function.
kwargs (spec) – A spec to be evaluated and “star-starred” into the underlying function.
One or both of the above arguments should be set.
star(), like other Invoke methods, may be called multiple times. The args and kwargs will be stacked in the order in which they are provided.
Alternative approach to functions: Call¶
An earlier, more primitive approach to callables in glom was the Call specifier type.
Warning
Given the superiority of its successor, Invoke, the Call type may be deprecated in a future release.
- class glom.Call(func=None, args=None, kwargs=None)[source]¶
Call specifies when a target should be passed to a function, func.
Call is similar to partial() in that it is no more powerful than lambda or other functions, but it is designed to be more readable, with a better repr.
- Parameters
func (callable) – a function or other callable to be called with the target
Call combines well with T to construct objects. For instance, to generate a dict and then pass it to a constructor:
>>> class ExampleClass(object):
...     def __init__(self, attr):
...         self.attr = attr
...
>>> target = {'attr': 3.14}
>>> glom(target, Call(ExampleClass, kwargs=T)).attr
3.14
This does the same as glom(target, lambda target: ExampleClass(**target)), but it’s easy to see which one reads better.
Note
Call is mostly for functions. Use a T object if you need to call a method.
Self-Referential Specs¶
Sometimes nested data repeats itself, either through recursive structure or just through redundancy.
- class glom.Ref(name, subspec=Sentinel('_MISSING'))[source]¶
Name a part of a spec and refer to it elsewhere in the same spec, useful for trees and other self-similar data structures.
- Parameters
name (str) – The name of the spec to reference.
subspec – Pass a spec to name it name, or leave unset to refer to an already-named spec.
The glom Scope¶
Sometimes data transformation involves more than a single target and spec. For those times, glom has a scope system designed to manage additional state.
Basic usage¶
On its surface, the glom scope is a dictionary of extra values that can be passed in to the top-level glom call. These values can then be addressed with the S object, which behaves similarly to the T object.
Here’s an example case, counting the occurrences of a value in the target, using the scope:
>>> count_spec = T.count(S.search)
>>> glom(['a', 'c', 'a', 'b'], count_spec, scope={'search': 'a'})
2
Note how S supports attribute-style dot-access for its keys. For keys which are not valid attribute names, key-style access is also supported.
Note
glom itself uses certain keys in the scope to manage internal state. Consider the namespace of strings, integers, builtin types, and other common Python objects open for your usage. Read the custom spec doc to learn about more advanced, reserved cases.
Updating the scope - S() & A¶
glom’s scope isn’t only set once when the top-level glom() function is called. It’s dynamic and updatable.
If your use case requires saving a value from one part of the target for usage elsewhere, then S will allow you to save values to the scope:
>>> target = {'data': {'val': 9}}
>>> spec = (S(value=T['data']['val']), {'val': S['value']})
>>> glom(target, spec)
{'val': 9}
Any keyword arguments passed to S will have their values evaluated as a spec, with the result being saved to the keyword argument name in the scope.
When only the target is being assigned, you can use A as a shortcut:
>>> target = {'data': {'val': 9}}
>>> spec = ('data.val', A.value, {'val': S.value})
>>> glom(target, spec)
{'val': 9}
A enables a shorthand which assigns the current target to a location in the scope.
Sensible saving - Vars & S.globals¶
Of course, glom’s scopes do not last forever. Much like function calls in Python, new child scopes can see and read values in parent scopes. When a child spec saves a new value to the scope, it’s lost when the child spec completes.
If you need values to be saved beyond a spec’s local scope, the best way to do that is to create a Vars object in a common ancestor scope. Vars acts as a mutable namespace where child scopes can store state and have it persist beyond their local scope. Choose a location in the spec such that all involved child scopes can see and share the value.
Note
glom precreates a global Vars object at S.globals. Any values saved there will be accessible throughout that given glom() call:
>>> last_spec = ([A.globals.last], S.globals.last)
>>> glom([3, 1, 4, 1, 5], last_spec)
5
While not shared across calls, most of the same care prescribed about using global state still applies.
- class glom.Vars(base=(), **kw)[source]¶
Vars is a helper that can be used with S in order to store shared mutable state.
Takes the same arguments as dict().
Arguments here should be thought of the same way as default arguments to a function. Each time the spec is evaluated, the same arguments will be referenced; so, think carefully about mutable data structures.
Core Exceptions¶
Not all data is going to match specifications. Luckily, glom errors are designed to be as readable and actionable as possible.
All glom exceptions inherit from GlomError, described below, along with other core exception types. For more details about handling and debugging exceptions, see “Exceptions & Debugging”.
- class glom.PathAccessError(exc, path, part_idx)[source]¶
This GlomError subtype represents a failure to access an attribute as dictated by the spec. The most commonly-seen error when using glom, it maintains a copy of the original exception and produces a readable error message for easy debugging.
If you see this error, you may want to:
Check the target data is accurate using Inspect
Catch the exception and return a semantically meaningful error message
Use glom.Coalesce to specify a default
Use the top-level default kwarg on glom()
In any case, be glad you got this error and not the one it was wrapping!
- Parameters
exc (Exception) – The error that arose when we tried to access path. Typically an instance of KeyError, AttributeError, IndexError, or TypeError, and sometimes others.
path (Path) – The full Path glom was in the middle of accessing when the error occurred.
part_idx (int) – The index of the part of the path that caused the error.
>>> target = {'a': {'b': None}}
>>> glom(target, 'a.b.c')
Traceback (most recent call last):
...
PathAccessError: could not access 'c', part 2 of Path('a', 'b', 'c'), got error: ...
- class glom.CoalesceError(coal_obj, skipped, path)[source]¶
This GlomError subtype is raised from within a Coalesce spec’s processing, when none of the subspecs match and no default is provided.
The exception object itself keeps track of several values which may be useful for processing:
- Parameters
>>> target = {}
>>> glom(target, Coalesce('a', 'b'))
Traceback (most recent call last):
...
CoalesceError: no valid values found. Tried ('a', 'b') and got (PathAccessError, PathAccessError) ...
Note
Coalesce is a branching specifier type, so as of v20.7.0, its exception messages feature an error tree. See Reading Branched Exceptions for details on how to interpret these exceptions.
- class glom.UnregisteredTarget(op, target_type, type_map, path)[source]¶
This GlomError subtype is raised when a spec calls for an unsupported action on a target type. For instance, trying to iterate on a non-iterable target:
>>> glom(object(), ['a.b.c'])
Traceback (most recent call last):
...
UnregisteredTarget: target type 'object' not registered for 'iterate', expected one of registered types: (...)
It should be noted that this is a pretty uncommon occurrence in production glom usage. See the Setup and Registration section for details on how to avoid this error.
An UnregisteredTarget takes and tracks a few values:
Setup and Registration¶
When it comes to targets, glom() will operate on the vast majority of objects out there in Python-land. However, for that very special remainder, glom is readily extensible!
- glom.register(target_type, **kwargs)[source]¶
Register target_type so glom() will know how to handle instances of that type as targets.
Here’s an example of adding basic iterable support for Django’s ORM:
import glom
import django.db.models
glom.register(django.db.models.Manager, iterate=lambda m: m.all())
glom.register(django.db.models.QuerySet, iterate=lambda qs: qs.all())
- Parameters
target_type (type) – A type expected to appear in a glom() call target
get (callable) – A function which takes a target object and a name, acting as a default accessor. Defaults to getattr().
iterate (callable) – A function which takes a target object and returns an iterator. Defaults to iter() if target_type appears to be iterable.
exact (bool) – Whether or not to match instances of subtypes of target_type.
Note
The module-level register() function affects the module-level glom() function’s behavior. If this global effect is undesirable for your application, or you’re implementing a library, consider instantiating a Glommer instance, and using the register() and Glommer.glom() methods instead.
- glom.register_op(op_name, **kwargs)[source]¶
For extension authors needing to add operations beyond the builtin ‘get’, ‘iterate’, ‘keys’, ‘assign’, and ‘delete’ to the default scope. See TargetRegistry for more details.
- class glom.Glommer(**kwargs)[source]¶
The Glommer type mostly serves to encapsulate type registration context so that advanced uses of glom don’t need to worry about stepping on each other.
Glommer objects are lightweight and, once instantiated, provide a glom() method:
>>> glommer = Glommer()
>>> glommer.glom({}, 'a.b.c', default='d')
'd'
>>> Glommer().glom({'vals': list(range(3))}, ('vals', len))
3
Instances also provide a register() method for localized control over type handling.