GUAC Explained in 5 Minutes

GUAC stands for Graph for Understanding Artifact Composition. It was developed by Google in collaboration with industry leaders to make sense of the growing volume of security metadata generated about artifacts across the software development lifecycle (SDLC). As the threat landscape evolves, a coalition built around a common framework for using security metadata can lead to more secure software. In this post, we provide a quick overview of GUAC and describe how, once it reaches maturity, it can help you untangle the complexity of security and dependency metadata.

Who Ordered The GUAC? 

The number of systems, services, and technologies involved in creating software is at an all-time high. Ensuring that every artifact you consume in the SDLC is secure is extremely hard, particularly when the composition of those artifacts is itself difficult to understand.

Security and development teams need to identify vulnerabilities or weak points in their applications quickly and easily. GUAC provides a graph-based data model that enables users to store, query, analyze, and visualize data related to the structure of their software artifact components. It also provides a standardized format for exchanging data between different security tools and services, making it easy to use across various platforms.

For example, when the Log4Shell vulnerability was discovered, organizations had to find out which of their artifacts were affected. This proved challenging because it required tracing every artifact back to the source code that generated it and examining its manifest files for references to vulnerable Log4j versions, a search that could span thousands of metadata documents across hundreds of sources.

Recently, multiple tools and frameworks have been introduced to help you gain visibility into different parts of an artifact’s development process. A popular approach is delivering additional security metadata alongside every artifact that the organization uses, usually in the form of files in specific structures and formats.  Security metadata documents can list the components that make up the software, as in the case of SBOMs, describe the build process, as in SLSA provenance, or attest to any process that was part of the artifact’s development lifecycle.  

Circling back to the Log4Shell example – can I use security metadata to find out which artifacts are affected? Security metadata documents can indeed answer this question and many others. However, there is an additional problem – most are provided as raw files intended to be consumed by automated processes. They can’t be scanned in an efficient, uniform manner, making it challenging to get full value from the data, especially as the volume of metadata documents grows across hundreds or thousands of sources.  

What is GUAC’s Recipe for Success?  

GUAC provides a way to load software security metadata to a graph database using an automated process. The different types of metadata are ingested, processed, and linked to each other, so that related entities are connected in the graph representation. By providing an interconnected representation of artifacts used, it helps security and development teams identify any unanticipated dependencies or potential vulnerabilities that arise. After ingestion, the metadata is available to query using standardized graph database tools, allowing the user to explore the security metadata and enforce security and remediation efforts in a more methodical way. 
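To make the ingest-and-link idea concrete, here is a minimal sketch in Python. It is not GUAC's actual API or data model: the document shapes, the `MetadataGraph` class, the edge labels, and every artifact name are hypothetical and heavily simplified (real SBOM and SLSA provenance formats are much richer). It only illustrates how two different metadata documents describing the same artifact end up connected through a shared node.

```python
from collections import defaultdict

# Hypothetical, simplified metadata documents (not real SBOM or SLSA schemas).
sbom = {
    "type": "sbom",
    "artifact": "shop-service:1.4.0",
    "components": ["log4j-core:2.14.1", "jackson-databind:2.13.0"],
}
provenance = {
    "type": "slsa_provenance",
    "artifact": "shop-service:1.4.0",
    "builder": "ci.example.internal",
}


class MetadataGraph:
    """Toy in-memory graph: nodes are strings, edges are (source, label, target)."""

    def __init__(self):
        self.edges = set()
        self.outgoing = defaultdict(set)

    def add_edge(self, source, label, target):
        self.edges.add((source, label, target))
        self.outgoing[source].add((label, target))

    def ingest(self, document):
        """Dispatch on the document type and turn its contents into edges."""
        artifact = document["artifact"]
        if document["type"] == "sbom":
            for component in document["components"]:
                self.add_edge(artifact, "DEPENDS_ON", component)
        elif document["type"] == "slsa_provenance":
            self.add_edge(artifact, "BUILT_BY", document["builder"])
        else:
            raise ValueError(f"unsupported document type: {document['type']}")


graph = MetadataGraph()
for doc in (sbom, provenance):
    graph.ingest(doc)

# Both documents describe the same artifact, so their facts meet at one node.
for label, target in sorted(graph.outgoing["shop-service:1.4.0"]):
    print(label, target)
```

Keying every document on the artifact identifier is the design choice that does the linking here: dependency facts from the SBOM and build facts from the provenance become edges of the same node, so a single traversal can see both.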

Currently, GUAC is in the early stages of development. Early demos show how to use GUAC to load locally stored security metadata into a Neo4j graph database and then run a couple of example queries through Neo4j's standard query interface. In the future, GUAC should provide additional capabilities, such as a unified query interface for users, a collector to fetch remotely stored metadata, and a policy engine to limit trust to certain types of metadata.


What Are The Benefits Of GUAC?

GUAC provides several potential advantages: it makes it easier for developers to understand the composition of artifacts in their applications; it provides a standardized format for exchanging data between different security tools; and it helps identify potential vulnerabilities or weak points quickly, before they become major issues.

The graph representation of security metadata supplied by GUAC can be leveraged to answer security questions like the following:

  1. Is one of my artifacts affected by an open-source vulnerability (e.g., Log4Shell)? 
    1. This could be checked using a query for the vulnerable package and seeing if any artifacts depend on it. 
  2. Are all my artifacts SLSA-compliant? 
    1. This could query all SLSA provenance documents and test whether all top-level artifacts have a corresponding SLSA provenance. 
  3. Which artifacts could've been tampered with by a potentially compromised actor X in my organization? 
    1. This could be checked by querying for all artifacts in the graph database that have been signed off by this actor. 
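The three questions above can be sketched as small graph queries. The Python below is an illustration under stated assumptions, not GUAC's query interface: the edge list stands in for a graph database, and the edge labels, the `TOP_LEVEL` set, the function names, and all artifact and actor names are hypothetical.

```python
# Toy edge list standing in for a graph database; all names are hypothetical.
EDGES = [
    ("web-app:2.0", "DEPENDS_ON", "logging-lib:1.1"),
    ("logging-lib:1.1", "DEPENDS_ON", "log4j-core:2.14.1"),
    ("batch-job:0.9", "DEPENDS_ON", "jackson-databind:2.13.0"),
    ("web-app:2.0", "HAS_SLSA_PROVENANCE", "provenance-001"),
    ("web-app:2.0", "SIGNED_BY", "alice"),
    ("batch-job:0.9", "SIGNED_BY", "bob"),
]
TOP_LEVEL = {"web-app:2.0", "batch-job:0.9"}


def affected_by(package):
    """Question 1: everything that depends on `package`, directly or transitively."""
    affected, frontier = set(), {package}
    while frontier:
        # Walk DEPENDS_ON edges backwards, one hop per iteration.
        frontier = {
            src for src, label, dst in EDGES
            if label == "DEPENDS_ON" and dst in frontier and src not in affected
        }
        affected |= frontier
    return affected


def missing_provenance():
    """Question 2: top-level artifacts with no SLSA provenance document attached."""
    with_provenance = {
        src for src, label, _ in EDGES if label == "HAS_SLSA_PROVENANCE"
    }
    return TOP_LEVEL - with_provenance


def signed_by(actor):
    """Question 3: artifacts signed off by a given (possibly compromised) actor."""
    return {
        src for src, label, dst in EDGES
        if label == "SIGNED_BY" and dst == actor
    }


print(sorted(affected_by("log4j-core:2.14.1")))
print(sorted(missing_provenance()))
print(sorted(signed_by("alice")))
```

Note that the first query has to follow dependency edges transitively: in this toy data, `web-app:2.0` never references Log4j directly and is only found through `logging-lib:1.1`, which is exactly the kind of link that is hard to spot when scanning raw metadata files one at a time.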

Once GUAC matures, more advanced query capabilities are likely to follow, and they can be used to gain deeper insights into the development process and artifacts within an organization.

Is GUAC Ready To Serve?  

Not yet. GUAC is a proof of concept and a visionary project that attempts to lay a foundation for answering the questions above. The strength of GUAC will be realized once a community around it solidifies and tooling becomes available and more widely supported. Once that starts to occur, you should consider implementing GUAC for the benefits outlined above. 

As GUAC matures, it's likely that the community around it will create a query library with ready-made queries that answer common security questions. This is much like Semgrep and the Semgrep registry, where the development of a SAST rule engine was leveraged to create a rule library that detects various suboptimal coding patterns.

Ready-made queries for security questions such as those presented above will make GUAC's benefits easier to realize and lower the barrier to entry for security professionals, allowing them to gain useful insights into their organization's development practices.

Can I Get GUAC’s Benefits Today?  

Yes, you can. Legit Security provides the benefits associated with GUAC and much more in an enterprise platform with an established Fortune 500 enterprise customer base. The Legit Security platform secures your development pipelines for gaps and leaks, the SDLC infrastructure and systems within those pipelines, and the people and their security posture as they operate within it. Legit Security secures this broader software supply chain environment with real-time visibility and risk scoring, and leverages a query-able graph database for SDLC modelling, system and artifact mapping, and vulnerability management.  

To explore the benefits of end-to-end software supply chain security in more detail, schedule a demo today.  

Published on
January 31, 2023