24 Python libraries for every Python developer – infoworld.com
By Serdar Yegulalp
Senior Writer, InfoWorld
Want a good reason for the smashing success of the Python programming language? Look no further than the massive collection of libraries available for Python, both native and third-party libraries. With so many Python libraries out there, though, it’s no surprise that some don’t get all the attention they deserve. Plus, programmers who work exclusively in one domain don’t always know about the goodies available to them for other kinds of work.
Here are 24 Python libraries you may have overlooked but are definitely worth your attention. These gems run the gamut of usefulness, simplifying everything from file system access, database programming, and working with cloud services to building lightweight web apps, creating GUIs, and working with images, ebooks, and Word files—and much more besides. Some are well-known, others lesser-known, but all of these Python libraries deserve a place in your toolbox.
What Libcloud does: Access multiple cloud providers through a single, consistent, unified API.
Why use Libcloud: If the above description of Apache Libcloud doesn’t make you clap your hands for joy, then you haven’t tried working with multiple clouds. Cloud providers all love to do things their way, making a unified mechanism for dealing with dozens of providers a huge timesaver and headache-soother. APIs are available for compute, storage, load balancing, and DNS, with support for Python 2.x and Python 3.x as well as PyPy, the performance-boosting JIT compiler for Python.
What Arrow does: Cleaner handling of dates and times in Python.
Why use Arrow: Dealing with time zones, date conversions, date formats, and all the rest is already a headache and a half. Throw in Python’s standard library for date/time work, and you get two headaches and a half.
Arrow provides four big advantages. One, Arrow is a drop-in replacement for Python’s datetime module, meaning that common function calls like .now() and .utcnow() work as expected. Two, Arrow provides methods for common needs like shifting and converting time zones. Three, Arrow provides “humanized” date/time information, such as being able to say something happened “an hour ago” or will happen “in two hours” without much effort. Four, Arrow can localize date/time information without breaking a sweat.
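Those advantages might look like this in practice. This is a minimal sketch that starts from a fixed timestamp (chosen arbitrarily for illustration) so the results are repeatable; real code would more likely start from arrow.utcnow().

```python
import arrow

# A fixed, timezone-aware instant so the example is reproducible.
start = arrow.get("2024-01-15T12:00:00+00:00")

# Shift forward two hours, then convert to another time zone.
later = start.shift(hours=2)
pacific = later.to("America/Los_Angeles")

# Humanize relative to an explicit reference time rather than "now".
relative = start.humanize(later)

# The same phrase, localized to Spanish.
relative_es = start.humanize(later, locale="es")

print(pacific.format("YYYY-MM-DD HH:mm ZZ"))
print(relative)
print(relative_es)
```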
What Behold does: Robust support for print-style debugging in Python.
Why use Behold: There is one simple way to debug in Python, or almost any programming language for that matter: Insert in-line print statements. But while print-debugging is a no-brainer in small programs, it’s not so easy to get useful results within large, sprawling, multi-module projects.
Behold provides a toolkit for contextual debugging via print statements. It allows you to impose a uniform look on the output, tag the results so they can be sorted via searches or filters, and provide contexts across modules so that functions that originate in one module can be debugged properly in another. Behold handles many common Python-specific scenarios like printing an object’s internal dictionary, unveiling nested attributes, and storing and reusing results for comparison at other points during the debugging process.
What Black does: Formats Python code according to a strict and almost totally immutable set of rules.
Why use Black: Python code formatters, like YAPF, tend to have many configurable options—line length, line-splitting options, handling of trailing commas, and so on. Black applies a consistent set of defaults for those rules that cannot be altered. The resulting formatted code is as consistent as possible across code bases and between users, with the fewest possible differences between edited files.
Black takes some getting used to, especially if you’re finicky about vertical whitespace, statements with deep nestings (e.g., lists within lists), and other formatting options. But in the long run it frees you from having to think about formatting, letting you concentrate on your code.
What Bottle does: Lightweight and fast web apps.
Why use Bottle: When you want to throw together a quick RESTful API or use the bare bones of a web framework to build an app, capable yet tiny Bottle gives you no more than you need. Routing, templates, access to request and response data, support for multiple server types from plain old CGI on up, and support for more advanced features like WebSockets—it’s all here. The amount of work needed to get started is likewise minimal, and Bottle’s design is elegantly extensible when more advanced functions are needed.
What Click does: Lets you quickly build command-line interfaces for Python apps.
Why use Click: GUIs are convenient, but CLIs are where the real power is. However, building a robust CLI is hardly easy, and the default toolset for gathering and using command-line options in Python is primitive.
Click wraps those bits and pieces in a high-level, CLI-construction API. If you just want to create a few basic commands, you can do that with a couple of lines of code. If you want more advanced behavior, like prompting separately for more information about a parameter, or deriving values from environment variables, Click has you covered. Click also supports terminal colors via the colorama library, and can be expanded with third-party plug-ins.
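A basic command with a prompt and an environment-variable fallback can be sketched as follows. The command name, option names, and the GREET_NAME variable are invented for this example; CliRunner is Click's own test helper and is used here only to exercise the command without a real terminal.

```python
import click
from click.testing import CliRunner


@click.command()
@click.option("--count", default=1, show_default=True, help="Number of greetings.")
@click.option(
    "--name",
    envvar="GREET_NAME",  # hypothetical variable, read if --name is omitted
    prompt="Your name",   # asked interactively if no value is found
    help="Who to greet.",
)
def greet(count, name):
    """Print a greeting COUNT times."""
    for _ in range(count):
        click.echo(f"Hello, {name}!")


# Invoke the command in-process, as a test would.
result = CliRunner().invoke(greet, ["--count", "2", "--name", "Ada"])
print(result.output)
```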
What EbookLib does: Read and write .epub files.
Why use EbookLib: Creating ebooks typically requires wrangling one command-line tool or another. EbookLib provides management tools and APIs that simplify the process. It works with EPUB 2 and EPUB 3 files, with Kindle support under development.
Provide the images and the text (the latter in HTML format), and EbookLib can assemble those pieces into an ebook complete with chapters, nested table of contents, images, HTML markup, and so on. Cover, spine, and stylesheet data are all supported, too. A plug-in system allows third parties to extend the library’s behaviors.
If you don’t need everything EbookLib has to offer, try Mkepub. Mkepub packs basic ebook assembly functionality in a library that is only a few kilobytes in size. One minor drawback of Mkepub is that it requires Jinja2, which in turn requires the MarkupSafe library.
What Gooey does: Give a console-based Python program a platform-native GUI.
Why use Gooey: Presenting users, especially rank-and-file users, with a command-line interface is among the best ways to discourage use of your application. Few apart from the hardcore geek like figuring out what options to pass in and in what order. Gooey takes arguments expected by the argparse library and presents them to users as a GUI form, by way of the wxPython library. All options are labeled and displayed with appropriate controls (such as a drop-down for a multi-option argument). Very little additional coding (a single import and a single decorator) is needed to make it work, assuming you’re already using argparse.
What Invoke does: Pythonic task execution, i.e., perform admin tasks and run shell commands using a Python library.
Why use Invoke: Using Python as a replacement for common shell scripting tasks makes a world of sense. Invoke provides a high-level API for running shell commands and managing command-line tasks as if they were Python functions, allowing you to embed those tasks in your own code or elegantly build around them. Just be careful not to allow untrusted input to be passed as-is to any shell commands.
What Nuitka does: Compile Python into self-contained C executables.
Why use Nuitka: Like Cython, Nuitka compiles Python into C. However, whereas Cython requires its own custom syntax for best results, and focuses mainly on math and statistics applications, Nuitka works with any Python program as-is, compiles it into C, and produces a single-file executable, applying optimizations where it can along the way. Nuitka is still in its early stages, and many of the planned optimizations are still to come. Nevertheless, it’s a convenient way to turn a Python script into a speedy command-line app.
What Numba does: Selectively speed up math-intensive functions.
Why use Numba: The Python world includes a whole subculture of packages for accelerating math operations. For example, NumPy works by wrapping high-speed C libraries in a Python interface, and Cython compiles Python to C with optional typing for accelerated performance. But Numba is easily the most convenient, as it allows Python functions to be selectively accelerated with nothing more than a decorator. For further speed boosts, you can use common Python idioms to parallelize workloads, or use SIMD or GPU instructions.
Note that you can use NumPy with Numba. After all, NumPy has many out-of-the-box algorithms that don’t need to be implemented from scratch. But for small “kernel” algorithms, Numba will in many cases outperform NumPy many times over.
What Openpyxl does: Reads, writes, and manipulates Excel files.
Why use Openpyxl: Ask someone to name three tools that number crunchers use in their work, and odds are you’ll get Python, R, and Excel, not necessarily in that order. Excel doesn’t (yet) have native Python connectivity, but third-party packages have bridged the gap in various ways.
Openpyxl works by modifying Excel files rather than by manipulating Excel directly. With Openpyxl, you can automate the creation of spreadsheets and workbooks, generate formulas, populate cells with those formulas, and perform many other operations. You can also change the properties of Excel objects, such as cell styles and conditional formatting. Anyone who spends significant time staring at spreadsheets will find something useful here.
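Here is a minimal sketch of that workflow: build a sheet, add formulas and a style, then read the file back. The sheet name and data are made up, and an in-memory buffer stands in for a real .xlsx path.

```python
from io import BytesIO

from openpyxl import Workbook, load_workbook
from openpyxl.styles import Font

wb = Workbook()
ws = wb.active
ws.title = "Sales"

# Each append() call adds one row; strings starting with "=" are formulas.
ws.append(["Item", "Qty", "Price", "Total"])
ws.append(["Widget", 3, 2.5, "=B2*C2"])
ws.append(["Gadget", 2, 4.0, "=B3*C3"])

# Make the header row bold.
for cell in ws[1]:
    cell.font = Font(bold=True)

# Save to a buffer (a file name works too), then load it again.
buffer = BytesIO()
wb.save(buffer)
buffer.seek(0)
reloaded = load_workbook(buffer)
formula = reloaded["Sales"]["D2"].value
print(formula)
```

Openpyxl stores the formula text only; Excel or another spreadsheet application computes the value when the file is opened.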
What Peewee does: A tiny ORM (object-relational mapper) that supports SQLite, MySQL, and PostgreSQL, with many extensions.
Why use Peewee: Not everyone loves an ORM; some would rather leave schema modeling on the database side and be done with it. But for developers who don’t want to touch databases, a well-constructed, unobtrusive ORM can be a godsend. And for developers who don’t want an ORM as full-blown as SQLAlchemy, Peewee is a great fit.
Peewee models are easy to construct, connect, and manipulate. Plus, many common query-manipulation functions, such as pagination, are built right in. More features are available as add-ons including extensions for other databases, testing tools, and a schema migration system—a feature even an ORM hater could learn to love. Note that the Peewee 3.x branch (the recommended edition) is not completely backward-compatible with previous versions of Peewee.
What Pillow does: Image processing without the pain.
Why use Pillow: Most Pythonistas who have performed image processing ought to be familiar with PIL (Python Imaging Library), but PIL is riddled with shortcomings and limitations, and it’s updated infrequently. Pillow aims to be both easier to use and code-compatible with PIL via minimal changes. Extensions are included for talking to both native Windows imaging functions and Python’s Tcl/Tk-backed Tkinter GUI package. Pillow is available through GitHub or the PyPI repository.
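The PIL compatibility shows up in the import itself, which still uses the PIL package name. A minimal sketch, using a generated image rather than a file on disk:

```python
from PIL import Image

# Create a solid red 200x100 image instead of loading one from disk.
img = Image.new("RGB", (200, 100), color=(255, 0, 0))

# thumbnail() resizes in place and preserves the aspect ratio.
thumb = img.copy()
thumb.thumbnail((50, 50))

# Convert to 8-bit grayscale.
gray = img.convert("L")

print(thumb.size, gray.mode)
```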
What Poetry does: Manages dependencies and packaging for your Python projects in a high-level way.
Why use Poetry: In theory you don’t have to do anything to start a new Python project except create an empty directory and fill it with .py files. In practice, especially for an ambitious project, you will need to do much more — create a README, set up some folder structure, declare your dependencies, and so on. Doing all of this by hand is a headache.
Poetry automates much of this setup and maintenance. Run poetry new to create a new project directory and virtual environment, pre-populated with a basic assortment of components. Declare your dependencies using Python’s own pyproject.toml file format, and Poetry will manage them for you. Existing Poetry-managed projects can have their dependencies automatically installed, refreshed, and modified from Poetry’s command line. Poetry also handles publishing to a remote repository (like PyPI).
What PyFilesystem does: A Pythonic interface to any file system — any file system.
Why use PyFilesystem: The fundamental idea behind PyFilesystem couldn’t be simpler: Just as Python’s file objects abstract a single file, PyFilesystem’s FS objects abstract an entire file system. This doesn’t mean only on-disk file systems, either. PyFilesystem also supports FTP directories, in-memory file systems, file systems for locations defined by the OS (such as the user directory), and even combinations of the above overlaid onto each other.
In addition to making it easier to write cross-platform code that manipulates files, PyFilesystem obviates the need to cobble together scripts from disparate parts of the standard library, mainly os and io. It also provides utilities that one might otherwise need to create from scratch, like a tool for printing console-friendly tree views of a file system.
What Pygame does: Create video games, or game-quality front-ends, in Python.
Copyright © 2024 IDG Communications, Inc.