Google Open Source Blog

Google and NIST partner on nanotechnology development platform

Tuesday, September 13, 2022

We’re proud to announce Google’s cooperative research and development agreement with the U.S. National Institute of Standards and Technology (NIST) to develop an open source testbed for nanotechnology research and development for American universities. NIST—a bureau of the U.S. Department of Commerce—will start by migrating their existing planarized wafer designs to an open source framework, which can be manufactured in the U.S. on SkyWater Technologies’ open source 130nm process (SKY130). The physical wafers and source code will be available in the coming months. Together, NIST, Google, and the open source community will develop designs to facilitate research into both basic and applied science, including technology transfer into production with U.S. manufacturers.

Furthering Google’s goals to improve access to semiconductor technology, this agreement will provide academic researchers with unprecedented resources from a semiconductor foundry to enhance research into the physics of semiconductors and nanodevices, including their chemistry, defects, electrical properties, high-frequency operation, and switching behavior, while reducing overall costs through economies of scale. Most importantly, this access enhances the technology transfer process by enabling researchers to develop new and emerging technologies using foundry resources that can then be seamlessly transitioned into mass production, since universities will already be using an industrially relevant platform. This will greatly improve scientists’ ability to move their technologies through the tech-transfer “valley of death” and into practical use.

Nanotechnology research has benefited from silicon wafers normally used for chip manufacturing in a unique way: instead of being turned into packaged microchips, their smooth, planarized surfaces make a great substrate for building and testing nanoscale structures. This likewise helps test how those structures transition into mass production.

Picture of a full wafer using the SKY130 open source PDK.

The wafer for this platform has a number of different metrology structures, from parametric test structures based on simple transistor arrays (which can be probed in a probe station) to thousands of complex measurements that users can operate using synthesized digital circuits. Critically, the wafers will be available to universities in a 200 mm form factor as mid-production planarized wafers with less than a single nanometer of surface roughness. Smooth, flat surfaces are critical for advanced manufacturing at small sizes.

NIST researchers are also ensuring that the wafers have photolithographic and electron beam alignment marks commonly found in university nanofabrication facilities, allowing the foundry silicon to be used directly by university researchers with ease. Metal pads on the surface will allow scientists to access the semiconductor transistors from the surface.

NIST scientists anticipate the nanotechnology accelerator platform will enhance scientific investigations into a diverse set of technologies, including memory devices (resistive switches, magnetic tunnel junctions, flash memories), artificial intelligence, plasmonics, semiconductor bioelectronics, thin film transistors and even quantum information science.

Picture of a development die from Google's OpenMPW program for the nanotechnology accelerator, developed by NIST and the University of Michigan.

This program also benefits from Google’s previous contributions to and support of the GDSFactory and OpenFASOC open source projects, which help automate and shorten the construction of these important measuring devices from months to days. Ahead of the full wafer tapeout in 2023, NIST scientists, working with partners at the University of Michigan, Carnegie Mellon, University of Maryland, The George Washington University, and Brown University, have been using Google's OpenMPW program to develop and test preliminary circuits that they expect to include in the nanotechnology accelerator. Preliminary testing will help ensure the program’s goals are met with working circuits that best serve the scientific community.
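
To give a sense of how a code-driven layout flow like GDSFactory shortens this kind of work, here is a minimal, hypothetical Python sketch. The pad_array component, its dimensions, and the layer numbers are illustrative assumptions and not part of the NIST platform:

import gdsfactory as gf

# Hypothetical parametric test structure: a small array of probe pads,
# written as code so it can be regenerated and re-parameterized easily.
@gf.cell
def pad_array(rows: int = 2, cols: int = 4, pitch: float = 150.0) -> gf.Component:
    c = gf.Component()
    for i in range(rows):
        for j in range(cols):
            # 100 x 100 um pad on an assumed metal layer (1, 0).
            pad = c.add_ref(gf.components.rectangle(size=(100.0, 100.0), layer=(1, 0)))
            pad.move((j * pitch, i * pitch))
    return c

if __name__ == "__main__":
    component = pad_array()
    component.write_gds("pad_array.gds")  # layout file ready for downstream tooling

Because layouts expressed this way are ordinary code, they can be reviewed, versioned, and swept across parameters far faster than hand-drawn layouts.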

A key factor in cutting-edge research is reproducibility, or the ability for researchers from different institutions to repeat each other’s experiments and improve upon them. By migrating to an open source framework, researchers can more easily share reproducible results, contribute to the creation of open source datasets to enhance future simulation, and advance the scientific community’s state of the art of nanotechnology and semiconductor manufacturing.

NIST and Google will distribute the first production run of wafers to leading U.S. universities. After the program, American scientists will be able to purchase the wafers directly from SkyWater without license requirements, giving them the freedom to pursue their research without restrictions. Since wafers are hundreds of times cheaper than full mask sets or the cost of designing integrated circuits from scratch, scientists will have a much easier time obtaining and using this powerful industrial technology. Longer term, working with NIST to develop future platforms on the recently announced SKY90FD open source PDK will further expand this R&D ecosystem.

To kick off this research effort, NIST is organizing the "NIST Integrated Circuits for Metrology Workshop" on September 20–21, 2022. The workshop will be held online, with a series of presentations and panel discussions on the first day. On the second day, a working group of researchers, scientists, and engineers will focus on the creation of parametric test structures for monolithic integration using open source silicon technology. Visit the event website for more details about the program and to register to attend or learn more about presenting.

By Ethan Mahintorabi, Software Engineer and Johan Euphrosine, Developer Programs Engineer – Hardware Toolchains Team, and Aaron Cunningham, Technical Program Manager – Google Open Source Programs Office

Accelerate your models to production with Google Cloud and PyTorch

Monday, September 12, 2022

We believe in the power of choice for machine learning development, and we continue to invest resources to make it easy for ML practitioners to train, deploy, and orchestrate models from a single unified data and AI cloud platform. We’re excited to announce our role as a founding member of the newly formed PyTorch Foundation, which will better position Google Cloud to make meaningful contributions to the PyTorch community. As a member of the board, we will deepen our open source investment to deliver on the Foundation’s mission to drive adoption of AI tooling by building an ecosystem of open source projects with PyTorch. We strongly believe in choice and will continue to invest in frameworks such as JAX and TensorFlow, and to support integrations with other OSS projects including Spark, Airflow, XGBoost, and others.

In this blog, we provide an overview of existing resources to help you get started with PyTorch on Google Cloud. We also talk about how ML practitioners can leverage our end-to-end ML platform to train, tune, and deploy PyTorch models.

PyTorch on Google Cloud

Open source in the cloud is important because it gives you flexibility and control over where you train and deploy your ML workloads. PyTorch is extensively used in the research space and in recent years it has gained immense traction in the industry due to its ease of use and deployment. In fact, according to a survey of Kaggle users, PyTorch is the fastest growing ML framework today.

ML practitioners using PyTorch tell us that it can be challenging to advance their ML project past experimentation. This is why Google Cloud has built integrations with PyTorch that make it easier to train, deploy, and orchestrate models in production. Some examples are:

  • PyTorch integrates directly with Vertex AI, a fully managed ML platform that provides the tools you need to take a model from PyTorch to production, like the PyTorch deep learning containers or the one-click PyTorch JupyterLab environment in Vertex AI Workbench.
  • PyTorch/XLA, an open source library, uses the XLA deep learning compiler to enable PyTorch to run on Cloud TPUs. Cloud TPUs are custom accelerators designed by Google, optimized for performance and total cost of ownership (TCO) on large-scale ML workloads. PyTorch/XLA also enables XLA-driven optimizations on GPUs (a minimal training-loop sketch follows this list).
  • TorchX provides an adapter to run and orchestrate TorchX components as part of Kubeflow Pipelines that you can easily scale on Vertex AI Pipelines.
  • With our OSS contributions to Apache Beam, we have made PyTorch models easy to deploy in batch or streaming data processing pipelines. Running on Google Dataflow, these pipelines scale to very large workloads in a fully managed, simple-to-maintain environment.
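
As a minimal illustration of the PyTorch/XLA item above, the sketch below trains a toy model on a Cloud TPU device. It assumes the torch_xla package is available on a Cloud TPU VM; the model and random data are placeholders standing in for a real workload:

import torch
import torch.nn as nn
import torch_xla.core.xla_model as xm

# Acquire the XLA device backing a Cloud TPU core.
device = xm.xla_device()

model = nn.Linear(10, 1).to(device)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
loss_fn = nn.MSELoss()

for step in range(10):
    # Placeholder random data standing in for a real dataset.
    x = torch.randn(8, 10, device=device)
    y = torch.randn(8, 1, device=device)

    optimizer.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()
    # xm.optimizer_step applies the update and triggers execution of the
    # lazily built XLA graph; barrier=True is appropriate for single-process runs.
    xm.optimizer_step(optimizer, barrier=True)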

To learn more and start using PyTorch on Google Cloud, check out the resources below:

PyTorch on Vertex AI Resources

  1. How to train and tune PyTorch models on Vertex AI: Learn how to use Vertex AI Training to build and train a sentiment text classification model using PyTorch, and Vertex AI Hyperparameter Tuning to tune the hyperparameters of PyTorch models (a minimal job-submission sketch follows this list).
  2. How to deploy PyTorch models on Vertex AI: Walk through the deployment of a PyTorch model using TorchServe as a custom container, by deploying the model artifacts to a Vertex Prediction service.
  3. Orchestrating PyTorch ML Workflows on Vertex AI Pipelines: See how to build and orchestrate ML pipelines for training and deploying PyTorch models on Google Cloud Vertex AI using Vertex AI Pipelines.
  4. Scalable ML Workflows using PyTorch on Kubeflow Pipelines and Vertex Pipelines: Take a look at examples of PyTorch-based ML workflows on two pipelines frameworks: OSS Kubeflow Pipelines, part of the Kubeflow project, and Vertex AI Pipelines. We also share new PyTorch built-in components added to Kubeflow Pipelines.
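
As a minimal job-submission sketch for the Vertex AI Training flow in item 1, the snippet below uses the google-cloud-aiplatform SDK. The project, region, bucket, script name, and container URI are assumptions; check the Vertex AI documentation for the current prebuilt PyTorch training container images:

from google.cloud import aiplatform

# Assumed project, region, and staging bucket.
aiplatform.init(
    project="my-project",
    location="us-central1",
    staging_bucket="gs://my-staging-bucket",
)

# Assumed local training script and prebuilt PyTorch training container URI.
job = aiplatform.CustomTrainingJob(
    display_name="pytorch-text-classifier",
    script_path="task.py",
    container_uri="us-docker.pkg.dev/vertex-ai/training/pytorch-gpu.1-12:latest",
    requirements=["transformers", "datasets"],
)

# Run the training job on a single GPU-backed machine.
job.run(
    replica_count=1,
    machine_type="n1-standard-8",
    accelerator_type="NVIDIA_TESLA_T4",
    accelerator_count=1,
)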

PyTorch/XLA and Cloud TPU/GPU

  1. Scaling deep learning workloads with PyTorch/XLA and Cloud TPU VM: Describes the challenges of scaling deep learning jobs to distributed training settings using the Cloud TPU VM, and shows how to stream training data from Google Cloud Storage (GCS) to PyTorch/XLA models running on Cloud TPU Pod slices.
  2. PyTorch/XLA: Performance debugging on Cloud TPU VM: Part I: In the first part of the performance debugging series on Cloud TPU, we lay out the conceptual framework for PyTorch/XLA in the context of training performance. We introduce a case study to make sense of preliminary profiler logs and identify corrective actions.
  3. PyTorch/XLA: Performance debugging on Cloud TPU VM: Part II: In the second part, we dive deeper into the analysis to uncover further performance improvement opportunities.
  4. PyTorch/XLA: Performance debugging on Cloud TPU VM: Part III: In the final part of the performance debugging series, we introduce user-defined code annotations and visualize them in the form of a trace.
  5. Train ML models with PyTorch Lightning on TPUs: Learn how easy it is to start training models on TPUs using PyTorch Lightning's built-in TPU support.

PyTorch on Apache Beam and Google Cloud Dataflow

  1. Integrating ML models into production pipelines with Dataflow: Learn how to use Apache Beam's RunInference transform with either single-model or multi-model pipelines at scale (a minimal sketch follows).
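
As a minimal sketch of the RunInference flow above, the snippet below wires a toy PyTorch model into a Beam pipeline. The GCS path and the LinearRegression class are assumptions for illustration; a real pipeline would point at your trained model's state_dict and run on Dataflow:

import apache_beam as beam
import torch
from apache_beam.ml.inference.base import RunInference
from apache_beam.ml.inference.pytorch_inference import PytorchModelHandlerTensor

# Toy model used only to show the handler wiring; in practice this is
# your trained torch.nn.Module.
class LinearRegression(torch.nn.Module):
    def __init__(self, input_dim: int = 10, output_dim: int = 1):
        super().__init__()
        self.linear = torch.nn.Linear(input_dim, output_dim)

    def forward(self, x):
        return self.linear(x)

# Assumed GCS path to a saved state_dict for the model above.
model_handler = PytorchModelHandlerTensor(
    state_dict_path="gs://my-bucket/models/linear_regression.pth",
    model_class=LinearRegression,
    model_params={"input_dim": 10, "output_dim": 1},
)

with beam.Pipeline() as pipeline:
    _ = (
        pipeline
        | "CreateExamples" >> beam.Create([torch.zeros(10), torch.ones(10)])
        | "RunInference" >> RunInference(model_handler)
        | "PrintPredictions" >> beam.Map(print)
    )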

Other resources

  1. Increase your productivity using PyTorch Lightning: Learn how to use PyTorch Lightning on Vertex AI Workbench (previously Notebooks).

By Erwin Huizing and Grace Reed – Cloud AI and ML

TestParameterInjector gets JUnit5 support

Wednesday, September 7, 2022

In March 2021, we announced the open source release of TestParameterInjector: A parameterized test runner for JUnit4 (see GitHub page).

Over a year later, Google-internal usage of TestParameterInjector has continued to grow rapidly, and it is now by far the most popular parameterized test framework at Google.

Graph of the different parameterized test frameworks at Google

Guava's philosophy frames it nicely: "When trying to estimate the ubiquity of a feature, we frequently use the Google internal code base as a reference." We also believe that TestParameterInjector usage at Google is a decent proxy for its utility elsewhere.

As you can see in the graph above, not only did TestParameterInjector reduce the usage of the other frameworks, but it also caused a drastic increase in the total number of parameterized tests. This suggests that TestParameterInjector lowered the threshold for parameterizing a regular unit test, and that Googlers are actively using this tool to improve the quality of their tests.

JUnit5 (Jupiter) support

At Google, we use JUnit4 exclusively, but some developers outside of Google have moved on to JUnit5 (Jupiter). For those users, we have now expanded the scope of TestParameterInjector.

We've kept the API the same as much as possible:

// **************** JUnit4 **************** //

@RunWith(TestParameterInjector.class)
public class MyTest {

  @TestParameter boolean isDryRun;

  @Test public void test1(@TestParameter boolean enableFlag) { ... }

  @Test public void test2(@TestParameter MyEnum myEnum) { ... }

  enum MyEnum { VALUE_A, VALUE_B, VALUE_C }
}

// **************** JUnit5 (Jupiter) **************** //

class MyTest {

  @TestParameter boolean isDryRun;

  @TestParameterInjectorTest
  void test1(@TestParameter boolean enableFlag) {
    // This method is run 4 times for all combinations of isDryRun and enableFlag
  }

  @TestParameterInjectorTest
  void test2(@TestParameter MyEnum myEnum) {
    // This method is run 6 times for all combinations of isDryRun and myEnum
  }

  enum MyEnum { VALUE_A, VALUE_B, VALUE_C }
}

The only differences are that @RunWith / @ExtendWith are not necessary and that every test method needs a @TestParameterInjectorTest annotation.

The other features of TestParameterInjector work in a similar way with Jupiter:

class MyTest {

  // **************** Defining sets of parameters **************** //
  @TestParameterInjectorTest
  @TestParameters(customName = "teenager", value = "{age: 17, expectIsAdult: false}")
  @TestParameters(customName = "young adult", value = "{age: 22, expectIsAdult: true}")
  void personIsAdult_success(int age, boolean expectIsAdult) {
    assertThat(personIsAdult(age)).isEqualTo(expectIsAdult);
  }

  // **************** Dynamic parameter generation **************** //
  @TestParameterInjectorTest
  void matchesAllOf_throwsOnNull(
      @TestParameter(valuesProvider = CharMatcherProvider.class) CharMatcher charMatcher) {
    assertThrows(NullPointerException.class, () -> charMatcher.matchesAllOf(null));
  }

  private static final class CharMatcherProvider implements TestParameterValuesProvider {
    @Override
    public List<CharMatcher> provideValues() {
      return ImmutableList.of(
          CharMatcher.any(), CharMatcher.ascii(), CharMatcher.whitespace());
    }
  }
}

Other things we've been working on

Custom names for @TestParameters
When running the following parameterized test:

@Test
@TestParameters("{age: 17, expectIsAdult: false}")
@TestParameters("{age: 22, expectIsAdult: true}")
public void withRepeatedAnnotation(int age, boolean expectIsAdult) { ... }

the generated test names will be:

MyTest#withRepeatedAnnotation[{age: 17, expectIsAdult: false}]
MyTest#withRepeatedAnnotation[{age: 22, expectIsAdult: true}]

This is fine for small parameter sets, but when the number of @TestParameters or parameters within the YAML string gets large, it quickly becomes hard to figure out what each parameter set is supposed to represent.

For those cases, we added the option to specify a customName:

@Test
@TestParameters(customName = "teenager", value = "{age: 17, expectIsAdult: false}")
@TestParameters(customName = "young adult", value = "{age: 22, expectIsAdult: true}")
public void personIsAdult(int age, boolean expectIsAdult) { ... }

To allow this API change, we had to allow @TestParameters to be used in a different way: the original way of specifying @TestParameters sets was as a list of YAML strings inside a single @TestParameters annotation. We considered multiple options for specifying the custom name inside these YAML strings, such as a magic _name key or an extra YAML mapping layer where the keys would be the test names. But we eventually settled on the aforementioned API, which makes @TestParameters a repeatable annotation, because it results in the least complex code and clearly separates the different parameter sets.

It should be noted that the original API (a list of YAML strings in a single annotation) still works, but it is now discouraged in favor of multiple @TestParameters annotations with a single YAML string each, even when customName isn't used. The main arguments for this recommendation are:
  • Consistency with the customName case, which needs a single YAML string per @TestParameters annotation
  • We believe it presents the list of parameter sets (especially when it's long) in a more readable way

Integration with RobolectricTestRunner

Recently, we've managed to create an internal version of RobolectricTestRunner that supports TestParameterInjector annotations. There is a significant amount of work left to open source this, and we are now considering when and how to do it.

Learn more

Our GitHub README provides an overview of the framework. Let us know on GitHub if you have any questions, comments, or feature requests!

By Jens Nyman – TestParameterInjector