TensorFlow use of Google Technologies

Monday March 13, 2017

TensorFlow is a large project. It has a lot of unique features, and as a Google project, it also connects with other Google projects: gflags, apputils, Bazel, protobuf, gRPC, gemmlowp, StreamExecutor, GFile, and even XLA. To better understand and use TensorFlow, it helps to know what these pieces are and how they fit together.


In TensorFlow code, you may see, for example, tf.app.flags.FLAGS. This is part of TensorFlow's implementation of command-line argument parsing. Flag parsing happens automatically when you use tf.app.run(). Internally it uses Python's standard argparse, but the API is inspired by Google's gflags.

Formerly Google Commandline Flags, gflags is a Google system for building standard command-line interfaces. There's a C++ implementation, which has a documentation page reminiscent of a command-line interface. The Python gflags is documented primarily through a collection of examples.

Python users may be more familiar with the standard argparse API, or perhaps even the older optparse, both of which provide functionality similar to that of gflags.
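As a rough sketch of the pattern, here's what flags-style argument handling looks like when built directly on the standard library's argparse; the flag names below are made up for illustration:

```python
import argparse

# A sketch of flags-style argument handling using only argparse,
# mirroring the pattern that tf.app.flags emulates. The flag names
# are illustrative, not from any real TensorFlow script.
parser = argparse.ArgumentParser()
parser.add_argument("--learning_rate", type=float, default=0.01,
                    help="Initial learning rate.")
parser.add_argument("--num_steps", type=int, default=1000,
                    help="Number of training steps.")

# parse_known_args tolerates extra arguments, much as tf.app.run does.
FLAGS, unparsed = parser.parse_known_args(["--learning_rate", "0.1"])
print(FLAGS.learning_rate)  # 0.1
print(FLAGS.num_steps)      # 1000
```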

While TensorFlow doesn't include a full gflags implementation, its flags module appears in quite a few code examples, where it imparts a little extra Google flavor to argument handling for TensorFlow scripts. You can use it (though there are reasons not to) or opt for a separate, possibly more full-featured package, whether that's the full gflags, argparse, or something stranger.


The TensorFlow tf.app.run() invocation doesn't actually use Google's Python apputils, but as with gflags, TensorFlow mimics some of the behavior of Google tooling.

The open source google-apputils for Python hasn't been updated since March 2015, but it's still pretty interesting, with a lot of functionality that could still be relevant.


Bazel is Google's open source build tool, filling a role like that of make; it evolved from Google's internal build tool, "Blaze".

You might encounter Bazel if you're building TensorFlow from source, for example.
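If you do build from source, the build step looks something like the following; this is a sketch based on the from-source install instructions of this era, and the exact target can vary:

```shell
# Run TensorFlow's interactive configuration script, then ask Bazel
# to build the pip package target.
./configure
bazel build //tensorflow/tools/pip_package:build_pip_package
```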



Protocol buffers (often protobuf, or just proto) "are Google's [open source] language-neutral, platform-neutral, extensible mechanism for serializing structured data – think XML, but smaller, faster, and simpler."

When you look at protocol buffers in their text representations, they look something like JSON.
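For a sense of what that means, here is a hypothetical message definition in .proto syntax; the message and field names are invented for illustration, though TensorFlow defines many of its own messages along these lines:

```protobuf
// A made-up message definition (proto3 syntax).
message TrainingConfig {
  string model_name = 1;
  int32 batch_size = 2;
  repeated float learning_rates = 3;
}
```

In the text format, an instance of this message might read:

```
model_name: "example"
batch_size: 32
learning_rates: 0.1
learning_rates: 0.01
```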

You likely won't encounter protocol buffers directly when using TensorFlow. They're used under the hood all over the place: for example, as the format for TensorBoard summaries, and to help make data transfer efficient in distributed settings.


gRPC is Google's open source framework for remote procedure call (RPC) systems. It uses protocol buffers.

If RPC is unfamiliar, it can be loosely compared to RESTful communication for web services.

You might encounter gRPC if you build a client for a TensorFlow Serving model server, for example.
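A gRPC API is declared as a service in a .proto file. This hypothetical definition is only a sketch, with invented names, though TensorFlow Serving's actual prediction API is declared along broadly similar lines:

```protobuf
// A hypothetical gRPC service definition in .proto syntax.
service Predictor {
  // A unary RPC: one request message in, one response message out.
  rpc Predict (PredictRequest) returns (PredictResponse);
}
```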



gemmlowp is Google's open source low-precision matrix multiplication library.

You likely won't encounter gemmlowp directly, but you'll benefit from it if you use TensorFlow's low-precision functionality.
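The core idea can be sketched in a few lines of plain Python: map floats onto 8-bit integers so that the heavy arithmetic can be done with small, fast integer operations. This is a toy illustration of quantization only, not gemmlowp's actual algorithm or API:

```python
# A toy sketch of low-precision arithmetic: floats in a known range
# are mapped onto integers in [0, 255], computed on, and mapped back.
# This illustrates the quantization idea, not gemmlowp itself.

def quantize(values, lo, hi):
    """Map floats in [lo, hi] onto integers in [0, 255]."""
    scale = (hi - lo) / 255.0
    return [round((v - lo) / scale) for v in values], scale

def dequantize(qvalues, lo, scale):
    """Map quantized integers back to approximate float values."""
    return [q * scale + lo for q in qvalues]

q, scale = quantize([0.0, 0.6, 1.0], 0.0, 1.0)
print(q)  # [0, 153, 255]
print(dequantize(q, 0.0, scale))  # approximately [0.0, 0.6, 1.0]
```

The round trip loses a little precision, which is the trade gemmlowp makes in exchange for speed and memory savings.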


NVIDIA currently dominates the GPU market, and so their CUDA language for programming GPUs is the dominant GPU language. But other vendors would like to sell GPUs too, and they would like everybody to standardize on OpenCL for programming them. Programmers, for their part, would rather write code once and run it on whatever platform is available; what they want is portable general-purpose computing on graphics processing units (GPGPU).

StreamExecutor is Google's GPGPU system. They've talked about open-sourcing it directly as part of the LLVM project, but as far as I know it still exists in open source only inside TensorFlow.

OpenCL support is not necessarily perfect, but you may be able to try it out if you build from source. It's issue #22 for the TensorFlow GitHub project.

StreamExecutor is another implementation technology that you aren't likely to encounter directly as a user of TensorFlow.


Google has C++ file I/O code that avoids thread locking, and this code, GFile, is included as part of TensorFlow.

This is another implementation technology that should benefit TensorFlow users silently within the system.


The Accelerated Linear Algebra (XLA) system is described as the TensorFlow compiler framework, but there's more to it than that.

For one thing, the aspects of XLA that appear in open source TensorFlow aren't all the XLA that exists; Google has its own custom hardware, Tensor Processing Units (TPUs), and both that hardware and the XLA components that target it are internal to Google.

For another, XLA has been described as "designed for reuse", and other projects could incorporate the technology as well.

XLA is still experimental, and largely deep inside TensorFlow's implementation. Users might try turning on XLA to see whether it helps speed up their computations.
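In the TensorFlow of this era (1.x), the global JIT switch is a session configuration option. The following is a configuration sketch only; it requires a TensorFlow installation with XLA support built in, and flipping the switch may or may not help for any given graph:

```python
import tensorflow as tf

# Ask the session to JIT-compile eligible parts of the graph with XLA.
config = tf.ConfigProto()
config.graph_options.optimizer_options.global_jit_level = (
    tf.OptimizerOptions.ON_1)
sess = tf.Session(config=config)
```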

Thanks to Googlers Julia Ferraioli, Pete Warden, and Martin Wicke for a brief Twitter exchange that informed this list, and to Derek Murray for correcting my misconception about gflags in TensorFlow.

I'm working on Building TensorFlow systems from components, a workshop at OSCON 2017.