8 Recent Google AI Publications On Machine Learning

Machine learning is a powerful tool for gleaning knowledge from massive amounts of data. While a great deal of machine learning research has focused on improving the accuracy and efficiency of training and inference algorithms, less attention has been paid to the equally important problem of monitoring the quality of the data fed to machine learning systems.

Highlights:
  1. Data Validation for Machine Learning
  2. 3LC: Lightweight and Effective Traffic Compression for Distributed Machine Learning
  3. Automatically Charting Symptoms From Patient-Physician Conversations Using Machine Learning
  4. Identifying and Correcting Label Bias in Machine Learning
  5. Identifying the intersections: User experience + research scientist collaboration in a generative machine learning interface
  6. How to Develop Machine Learning Models for Healthcare
  7. TensorFlow.js: Bringing Machine Learning to the Web and Beyond
  8. Toward Exploring End-to-End Learning Algorithms for Autonomous Aerial Machines
Data Validation for Machine Learning

The importance of this problem is hard to dispute: errors in the input data can nullify any gains in the speed and accuracy of training and inference. This argument points to a data-centric approach to machine learning that treats training and serving data as an important production asset, on par with the algorithms and infrastructure used for learning.

In this paper, Google AI tackles this problem and presents a data validation system designed to detect anomalies specifically in data fed into machine learning pipelines. The system is deployed in production as an integral part of TFX (Baylor et al., 2017), an end-to-end machine learning platform at Google, where hundreds of product teams use it to continuously monitor and validate several petabytes of production data per day. We faced several challenges in developing our system, most notably around the ability of ML pipelines to soldier on in the face of unexpected patterns, schema-free data, or training/serving skew.

We discuss these challenges, the techniques we used to address them, and the various design choices that we made in implementing the system. Finally, we present evidence from the system's deployment in production that illustrates the tangible benefits of data validation in the context of ML: early detection of errors, model-quality wins from using better data, savings in engineering hours to debug problems, and a shift towards data-centric workflows in model development.
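The production system itself is Python-based and its real APIs are not shown here; the following is only a minimal TypeScript sketch of the core idea, under the assumption that validation reduces to inferring a schema from trusted baseline data and then flagging new batches that deviate from it. Every name in this sketch (FeatureSpec, inferSchema, validate) is hypothetical, not a TFX API.

```typescript
// Hypothetical schema-based validation, loosely inspired by the idea in the
// paper; none of these names are real TFX APIs.
type FeatureSpec = { type: 'number' | 'string'; required: boolean };
type Schema = Map<string, FeatureSpec>;
type Example = Record<string, number | string | null | undefined>;

// Infer a schema from a trusted baseline: every feature seen becomes
// required, with the type observed in the data.
function inferSchema(baseline: Example[]): Schema {
  const schema: Schema = new Map();
  for (const ex of baseline) {
    for (const [name, value] of Object.entries(ex)) {
      if (value == null) continue;
      schema.set(name, { type: typeof value as 'number' | 'string', required: true });
    }
  }
  return schema;
}

// Validate a new batch against the schema and report anomalies such as
// missing features or type skew between training and serving data.
function validate(batch: Example[], schema: Schema): string[] {
  const anomalies: string[] = [];
  batch.forEach((ex, i) => {
    for (const [name, spec] of schema) {
      const value = ex[name];
      if (value == null) {
        if (spec.required) anomalies.push(`row ${i}: missing feature '${name}'`);
      } else if (typeof value !== spec.type) {
        anomalies.push(`row ${i}: '${name}' is ${typeof value}, expected ${spec.type}`);
      }
    }
  });
  return anomalies;
}

// Usage: a schema inferred from training data guards the serving path.
const schema = inferSchema([{ age: 34, country: 'US' }]);
console.log(validate([{ age: '34', country: null }], schema));
```

A real deployment would also track statistical drift between training and serving data over time; the schema-versus-batch check above is only the simplest instance of the pattern.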

3LC: Lightweight and Effective Traffic Compression for Distributed Machine Learning

3LC is a lossy compression scheme for state change traffic in distributed machine learning (ML) that strikes a balance between multiple goals: traffic reduction, accuracy, computation overhead, and generality. It combines three techniques (3-value quantization with sparsity multiplication, base-3^5 encoding, and zero-run encoding) to leverage the strengths of quantization and sparsification techniques while avoiding their drawbacks.

3LC achieves a data compression ratio of up to 39--107X, preserves the high test accuracy of trained models, and provides high compression speed. Distributed ML frameworks can use 3LC without modifications to existing ML algorithms. Our experiments show that 3LC reduces wall-clock training time of ResNet-110 for CIFAR-10 on a bandwidth-constrained 10-GPU cluster by up to 16--23X compared to TensorFlow's baseline design.
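To give a rough feel for the first two of those techniques, here is a minimal TypeScript sketch of 3-value quantization with a sparsity multiplier and base-3^5 packing. It deliberately omits the paper's zero-run encoding and error-accumulation details, the exact role of the multiplier is our simplification, and all function names are ours, not 3LC's.

```typescript
// Loose sketch of 3LC-style compression (simplified; not the paper's code).

// 3-value quantization: quantize each value to {-1, 0, +1} relative to a
// scale m = max|g| / s. In this sketch, a larger sparsity multiplier s
// lowers the threshold, so fewer values round to zero (less sparsity,
// lower quantization error).
function quantize3(grad: Float32Array, s = 1.0): { trits: Int8Array; scale: number } {
  let maxAbs = 0;
  for (const g of grad) maxAbs = Math.max(maxAbs, Math.abs(g));
  const scale = maxAbs / s || 1; // guard against an all-zero tensor
  const trits = new Int8Array(grad.length);
  for (let i = 0; i < grad.length; i++) {
    trits[i] = Math.max(-1, Math.min(1, Math.round(grad[i] / scale)));
  }
  return { trits, scale };
}

// Base-3^5 encoding: five ternary digits fit in one byte, since 3^5 = 243
// <= 256. (A decoder also needs the original length for a partial final group.)
function packBase243(trits: Int8Array): Uint8Array {
  const out = new Uint8Array(Math.ceil(trits.length / 5));
  for (let i = 0; i < trits.length; i++) {
    out[(i / 5) | 0] = out[(i / 5) | 0] * 3 + (trits[i] + 1); // {-1,0,1} -> {0,1,2}
  }
  return out;
}

// Receiver side: multiply the trits back by the scale to reconstruct.
function dequantize(trits: Int8Array, scale: number): Float32Array {
  return Float32Array.from(trits, (t) => t * scale);
}
```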


Automatically Charting Symptoms From Patient-Physician Conversations Using Machine Learning

Introduction: Auto-charting, the creation of structured sections of clinical notes directly from a patient-doctor encounter, holds promise for lifting the documentation burden from physicians. However, clinicians exercise professional judgement in deciding what and how to document, and it is unknown whether a machine learning (ML) model could assist with these tasks.

Objective: Build an ML model to extract symptoms and their status (i.e., experienced, not experienced, not relevant for the note) from transcripts of patient-doctor encounters, and assess performance on common symptoms and on conversations in which a human scribe is not used.

Methods: We developed an ML model to auto-generate a review of systems (ROS) from transcripts of 90,000 de-identified medical encounters. Of these, 2950 transcripts were labeled by medical scribes to identify 171 common symptoms. For 800 snippets, model accuracy was stratified by how clearly a symptom was mentioned in conversation, as assessed by a formal rating system termed conversational clarity. The model was also qualitatively assessed across a variety of conversational motifs.

Results: Overall, the model had a sensitivity of 0.71 for matching the exact symptom labeled by a human, with a positive predictive value of 0.69. Model sensitivity was associated with the clarity of the conversation (p<0.0001). 39.5% (316/800) of snippets of common symptoms mentioned the symptom with high clarity, and in this group the sensitivity of the model was 0.91. The model was robust to a variety of conversational motifs (e.g., detecting symptoms mentioned in colloquial ways).
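For reference, the sensitivity and positive predictive value quoted above follow the standard definitions, shown here as a tiny TypeScript helper:

```typescript
// Sensitivity (recall) and positive predictive value (precision):
// sensitivity = TP / (TP + FN), PPV = TP / (TP + FP).
const sensitivity = (tp: number, fn: number): number => tp / (tp + fn);
const ppv = (tp: number, fp: number): number => tp / (tp + fp);
```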

Conclusions: Auto-generating a review of systems is feasible across a wide range of symptoms that are commonly discussed in doctor-patient encounters.


Identifying and Correcting Label Bias in Machine Learning

Datasets often contain biases that unfairly disadvantage certain groups, and classifiers trained on such datasets can inherit these biases. In this paper, we provide a mathematical formulation of how this bias can arise. We do so by assuming the existence of underlying, unknown, and unbiased labels which are overwritten by an agent who intends to provide accurate labels but may have biases against certain groups.

Although we observe only the biased labels, we show that the bias can nevertheless be corrected by re-weighting the data points without changing the labels. We show, with theoretical guarantees, that training on the re-weighted dataset corresponds to training on the unobserved but unbiased labels, thus leading to an unbiased machine learning classifier.

Our procedure is fast and robust and can be used with virtually any learning algorithm. We evaluate on a number of standard machine learning fairness datasets and a variety of fairness notions, finding that our method outperforms standard approaches in achieving fair classification.
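To convey the flavor of the approach, here is a loose TypeScript sketch of re-weighting against a demographic-parity-style constraint. It is a heavy simplification, not the paper's algorithm verbatim: the update rule, the learning rate, and all names are our own, and trainAndPredict stands in for whatever learning algorithm is being wrapped.

```typescript
// Loose sketch of correcting label bias by example re-weighting.
type LabeledExample = { features: number[]; label: 0 | 1; group: string };

// trainAndPredict is any learner that accepts per-example weights and
// returns hard 0/1 predictions on the same data.
function reweight(
  data: LabeledExample[],
  trainAndPredict: (data: LabeledExample[], weights: number[]) => number[],
  rounds = 10,
  lr = 0.5,
): number[] {
  const groups = [...new Set(data.map((d) => d.group))];
  const lambda = new Map<string, number>();
  for (const g of groups) lambda.set(g, 0); // one multiplier per group
  let weights = data.map(() => 1);

  for (let t = 0; t < rounds; t++) {
    const preds = trainAndPredict(data, weights);
    const overall = preds.reduce((a, p) => a + p, 0) / preds.length;
    for (const g of groups) {
      const idx = data.map((_, i) => i).filter((i) => data[i].group === g);
      const rate = idx.reduce((a, i) => a + preds[i], 0) / idx.length;
      // If group g receives fewer positive predictions than average,
      // raise its multiplier (a demographic-parity-style violation signal).
      lambda.set(g, (lambda.get(g) ?? 0) + lr * (overall - rate));
    }
    // Up-weight positive examples in under-predicted groups and vice
    // versa; note the labels themselves are never changed.
    weights = data.map((d) =>
      Math.exp((lambda.get(d.group) ?? 0) * (d.label === 1 ? 1 : -1)));
  }
  return weights;
}
```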


Identifying the intersections: User experience + research scientist collaboration in a generative machine learning interface

Creative generative machine learning interfaces are stronger when multiple actors bearing different points of view actively contribute to them. User experience (UX) research and design involvement in the creation of machine learning (ML) models helps ML research scientists more effectively identify the human needs that ML models will fulfill. The People and AI Research (PAIR) group within Google developed a novel program in which UXers are embedded in an ML research group for three months to provide a human-centered perspective on the creation of ML models. The first full-time cohort of UXers was embedded in a team of ML research scientists focused on deep generative models to assist in music composition.

Here, we discuss the structure and goals of the program, the challenges we faced during its execution, and the insights gained as a result of the process. We offer practical suggestions for how to foster communication between UX and ML research teams, and recommend UX design processes for building creative generative machine learning interfaces.

How to Develop Machine Learning Models for Healthcare

Advances in machine learning (ML), faster processors, and the availability of digitized healthcare data have contributed to a growing number of papers describing ML applications in healthcare. 

A common goal of these ML models is to improve patient care, both for clinicians and patients. In this piece, we will discuss the importance of the intended use of the ML model and its role throughout the process: problem selection, data collection, ML model development, validation, assessment of impact, deployment, and monitoring. 

We will focus our discussion on ML models for diagnosis (disease presence) or prognosis (risk of future outcome), both of which involve predicting a label based on input data. 

These principles are also applicable to other clinical applications such as image segmentation for radiation therapy planning and measuring cardiac parameters from echocardiography.

TensorFlow.js: Bringing Machine Learning to the Web and Beyond

TensorFlow.js is a library for building and executing machine learning algorithms in JavaScript. TensorFlow.js models run in a web browser and in the Node.js environment. The library is part of the TensorFlow ecosystem, providing a set of APIs that are compatible with those in Python, allowing models to be ported between the Python and JavaScript ecosystems. 

TensorFlow.js has empowered a new set of developers from the extensive JavaScript community to build and deploy machine learning models and enabled new classes of on-device computation. This paper describes the design, API, and implementation of TensorFlow.js, and highlights some of the impactful use cases.
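To give a feel for the API, here is a minimal TypeScript example using the Layers API (which mirrors Python's tf.keras); it assumes the standard @tensorflow/tfjs package:

```typescript
import * as tf from '@tensorflow/tfjs';

// A tiny regression model built with the Layers API.
const model = tf.sequential();
model.add(tf.layers.dense({ inputShape: [4], units: 8, activation: 'relu' }));
model.add(tf.layers.dense({ units: 1 }));
model.compile({ optimizer: 'sgd', loss: 'meanSquaredError' });

// Toy data; in the browser training runs on the WebGL backend, and in
// Node.js on the CPU (or GPU, with the appropriate backend package).
const xs = tf.randomNormal([32, 4]);
const ys = tf.randomNormal([32, 1]);

async function run() {
  await model.fit(xs, ys, { epochs: 5 });
  (model.predict(tf.randomNormal([1, 4])) as tf.Tensor).print();
}

run();
```

The same script runs unchanged in a browser and under Node.js, which is the portability between environments that the paper emphasizes.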


Toward Exploring End-to-End Learning Algorithms for Autonomous Aerial Machines

We develop Air Learning, a tool suite for end-to-end closed-loop UAV analysis, equipped with a customized yet randomized environment generator that exposes the UAV to a diverse set of challenges.

We take Deep Q-Networks (DQN) as an example deep reinforcement learning algorithm and use curriculum learning to train a point-to-point obstacle-avoidance policy. After selecting the best policy based on success rate, we evaluate it under strict resource constraints on an embedded platform such as the Raspberry Pi 3.

Using a hardware-in-the-loop methodology, we quantify the policy's performance with quality-of-flight metrics such as energy consumed, endurance, and the average length of the trajectory. We find that the trajectories produced on the embedded platform differ substantially from those predicted on the desktop, resulting in up to 26.43% longer trajectories.

Quality-of-flight metrics measured with hardware in the loop characterize those differences in simulation, exposing how the choice of onboard compute narrows or widens the 'Sim2Real' gap.
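As a concrete sketch of what such quality-of-flight metrics might look like when computed from a logged trajectory (the Sample type and all field names below are hypothetical, not Air Learning's actual interfaces):

```typescript
// Hypothetical trajectory log entry: position (m), timestamp (s), power (W).
type Sample = { x: number; y: number; z: number; t: number; power: number };

// Trajectory length: sum of straight-line distances between samples.
function trajectoryLength(traj: Sample[]): number {
  let len = 0;
  for (let i = 1; i < traj.length; i++) {
    len += Math.hypot(traj[i].x - traj[i - 1].x,
                      traj[i].y - traj[i - 1].y,
                      traj[i].z - traj[i - 1].z);
  }
  return len;
}

// Energy consumed: integrate instantaneous power over time (trapezoidal rule).
function energyConsumed(traj: Sample[]): number {
  let e = 0;
  for (let i = 1; i < traj.length; i++) {
    e += 0.5 * (traj[i].power + traj[i - 1].power) * (traj[i].t - traj[i - 1].t);
  }
  return e;
}

// Endurance: elapsed flight time covered by the log.
const endurance = (traj: Sample[]): number => traj[traj.length - 1].t - traj[0].t;
```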

Google’s mission is to organize the world’s information and make it universally accessible and useful. AI is helping us do that in exciting new ways, solving problems for our users, our customers, and the world.

AI is making it easier for people to do things every day, whether it’s searching for photos of loved ones, breaking down language barriers in Google Translate, typing emails on the go, or getting things done with the Google Assistant. AI also provides new ways of looking at existing problems, from rethinking healthcare to advancing scientific discovery.