
Using the PageSpeed Insights API with Python: A Complete Guide with Code Examples

Google’s PageSpeed Insights (PSI) is a powerful tool that analyzes the performance of a website, offering suggestions to improve speed and user experience. For SEOs and web developers, automating this process can save time and streamline workflows. In this article, we’ll dive into how to use the PageSpeed Insights API with Python to automate performance reports, including step-by-step instructions and code examples.

What is PageSpeed Insights?

PageSpeed Insights (PSI) provides metrics about a page’s performance on both mobile and desktop devices. The tool offers detailed feedback on load speed, interactivity, visual stability, and overall user experience. It scores websites and provides recommendations for optimization.

Why Use the PageSpeed Insights API with Python?

  • Automation: Rather than manually checking every page, you can automatically generate performance reports for multiple URLs.
  • Efficiency: Python scripts can help you retrieve and process data quickly, especially when dealing with large numbers of web pages.
  • Custom Reports: With Python, you can format the PSI data however you like, making it easier to analyze and integrate into existing workflows.

Step 1: Getting the API Key

To use the PageSpeed Insights API, you first need an API key from Google Cloud. Follow these steps to get one:

  1. Go to the Google Cloud Console.
  2. Create a new project or select an existing one.
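  3. In the navigation menu, go to APIs & Services > Library, search for the PageSpeed Insights API, and enable it for the project.
  4. Go to APIs & Services > Credentials.
  5. Click Create Credentials and select API Key.
  6. Save the generated API key. You’ll need this for your Python script.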
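
Hardcoding the key in a script makes it easy to leak through version control. One alternative is to read it from an environment variable. The sketch below assumes that approach; the variable name PSI_API_KEY and the helper load_api_key() are illustrative choices for this example, not something Google requires.

```python
import os


def load_api_key(env_var="PSI_API_KEY"):
    """Return the API key stored in the given environment variable.

    The variable name is an assumption made for this example; use whatever
    name fits your setup.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(
            f"Set the {env_var} environment variable to your API key."
        )
    return key
```

After exporting the variable in your shell, the script could use API_KEY = load_api_key() in place of a literal string.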

Step 2: Installing Required Python Libraries

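You need the requests library to send API requests. The json module that the script also imports ships with Python’s standard library, so it needs no installation (and requests can decode the JSON response on its own via response.json()). If you don’t have requests installed, you can install it using: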

pip install requests
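
If you are unsure whether a library is already available in your environment, a quick check like the following can help. It is a small sketch that uses only the standard library; the helper name is_installed() is made up for this example.

```python
import importlib.util


def is_installed(module_name):
    """Return True if the named top-level module can be imported."""
    return importlib.util.find_spec(module_name) is not None


# Report on the two modules the script relies on.
for name in ("requests", "json"):
    status = "available" if is_installed(name) else "missing"
    print(f"{name}: {status}")
```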

Step 3: Writing the Python Script

Below is a sample Python script that calls the PageSpeed Insights API, fetches the performance data for a given URL, and processes the results.

import requests
import json

# Define the API key and endpoint
API_KEY = 'your_api_key_here'
API_URL = 'https://www.googleapis.com/pagespeedonline/v5/runPagespeed'

def get_pagespeed_insights(url, strategy='desktop'):
    # Set up parameters for the API call
    params = {
        'url': url,
        'key': API_KEY,
        'strategy': strategy  # Either 'mobile' or 'desktop'
    }

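    # Make the API request; a timeout keeps the script from hanging indefinitely
    try:
        response = requests.get(API_URL, params=params, timeout=60)
    except requests.RequestException as exc:
        print(f"Request failed: {exc}")
        return None

    if response.status_code == 200:
        return response.json()

    print(f"Error: {response.status_code}")
    return None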

def parse_pagespeed_data(data):
    # Extract relevant metrics from the API response.
    # The API reports the score on a 0-1 scale, so convert it to 0-100.
    performance_score = round(data['lighthouseResult']['categories']['performance']['score'] * 100)
    metrics = data['lighthouseResult']['audits']['metrics']['details']['items'][0]

    # Lighthouse lab metrics (timings are reported in milliseconds)
    first_contentful_paint = metrics['firstContentfulPaint'] / 1000  # in seconds
    speed_index = metrics['speedIndex'] / 1000  # in seconds
    largest_contentful_paint = metrics['largestContentfulPaint'] / 1000  # in seconds
    total_blocking_time = metrics['totalBlockingTime']  # in milliseconds
    cumulative_layout_shift = metrics['cumulativeLayoutShift']

    print(f"Performance Score: {performance_score}")
    print(f"First Contentful Paint: {first_contentful_paint} sec")
    print(f"Speed Index: {speed_index} sec")
    print(f"Largest Contentful Paint: {largest_contentful_paint} sec")
    print(f"Total Blocking Time: {total_blocking_time} ms")
    print(f"Cumulative Layout Shift: {cumulative_layout_shift}")

    # Return the values so callers can store them as well as display them
    return (performance_score, first_contentful_paint, speed_index,
            largest_contentful_paint, total_blocking_time, cumulative_layout_shift)

def main():
    # URL to be analyzed
    url = 'https://www.example.com'

    # Fetch data from PageSpeed Insights API
    data = get_pagespeed_insights(url, strategy='mobile')

    if data:
        # Parse and display the performance metrics
        parse_pagespeed_data(data)

if __name__ == "__main__":
    main()

Explanation of the Script

1. API Call Function

The get_pagespeed_insights() function sends a GET request to the PageSpeed Insights API endpoint. It accepts two parameters:

  • url: The URL of the webpage you want to analyze.
  • strategy: The strategy can be either mobile or desktop, depending on the device type you wish to optimize for.

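On success, the function returns the parsed JSON response as a Python dictionary containing detailed performance data. If the request fails, it prints an error message and returns None.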
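
To see what the request looks like on the wire, it can help to build the URL by hand. The sketch below does roughly what requests does with the params dictionary, using only the standard library. The helper build_request_url() and its strategy check are additions for illustration, not part of the original script.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

API_URL = "https://www.googleapis.com/pagespeedonline/v5/runPagespeed"


def build_request_url(page_url, api_key, strategy="desktop"):
    """Return the full request URL for one page and strategy."""
    if strategy not in ("mobile", "desktop"):
        raise ValueError("strategy must be 'mobile' or 'desktop'")
    # urlencode percent-encodes the page URL so it is safe inside a query string
    query = urlencode({"url": page_url, "key": api_key, "strategy": strategy})
    return f"{API_URL}?{query}"


example = build_request_url("https://www.example.com", "your_api_key_here", "mobile")
print(example)

# Decode the query string again to confirm the three parameters round-trip
print(parse_qs(urlsplit(example).query))
```

Validating the strategy value before sending the request catches typos such as 'mobil' locally instead of through an API error.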

2. Parsing and Displaying Data

The parse_pagespeed_data() function extracts important performance metrics from the JSON response:

  • Performance Score: A score out of 100 that reflects the overall page speed. The API reports it on a scale from 0 to 1, which is why the script multiplies it by 100.
  • First Contentful Paint (FCP): The time it takes for the first piece of content to be visible to the user.
  • Speed Index: A metric that shows how quickly the contents of the page are visually populated.
  • Largest Contentful Paint (LCP): The render time of the largest content element visible within the viewport.
  • Total Blocking Time (TBT): The total time during which long tasks on the main thread (typically JavaScript execution) block the page from responding to user input.
  • Cumulative Layout Shift (CLS): Measures visual stability by calculating how much the page layout shifts during load.

These values are printed out to the console, but you can easily save them to a file or database for further analysis.
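
Indexing directly into nested keys raises a KeyError if any level is missing from a response. A more defensive variant might look like the sketch below, which reads the same keys as the script but returns None for anything absent. The function name extract_metrics(), the dictionary keys it returns, and the sample values are all made up for this example.

```python
def extract_metrics(data):
    """Pull the headline lab metrics out of a PSI response dictionary.

    Missing keys produce None instead of raising KeyError.
    """
    lighthouse = data.get("lighthouseResult", {})
    score = lighthouse.get("categories", {}).get("performance", {}).get("score")
    items = (
        lighthouse.get("audits", {})
        .get("metrics", {})
        .get("details", {})
        .get("items", [])
    )
    raw = items[0] if items else {}

    def to_seconds(key):
        # Timings arrive in milliseconds; convert when present
        value = raw.get(key)
        return None if value is None else value / 1000

    return {
        "performance_score": None if score is None else round(score * 100),
        "fcp_s": to_seconds("firstContentfulPaint"),
        "speed_index_s": to_seconds("speedIndex"),
        "lcp_s": to_seconds("largestContentfulPaint"),
        "tbt_ms": raw.get("totalBlockingTime"),
        "cls": raw.get("cumulativeLayoutShift"),
    }


# Hand-written sample in the shape the script reads; the values are invented.
sample = {
    "lighthouseResult": {
        "categories": {"performance": {"score": 0.87}},
        "audits": {"metrics": {"details": {"items": [{
            "firstContentfulPaint": 1200,
            "speedIndex": 2500,
            "largestContentfulPaint": 3100,
            "totalBlockingTime": 150,
            "cumulativeLayoutShift": 0.02,
        }]}}},
    }
}
print(extract_metrics(sample))
```

Returning a dictionary rather than printing also makes the values easier to pass on to a file or database writer.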

Step 4: Analyzing Multiple URLs

If you need to check multiple URLs, simply modify the script to loop through a list of URLs.

urls = ['https://www.example1.com', 'https://www.example2.com']

for url in urls:
    data = get_pagespeed_insights(url, strategy='desktop')
    if data:
        print(f"Performance for {url}:")
        parse_pagespeed_data(data)
        print("\n")
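
For longer lists, you may want to space out requests and keep going when one URL fails. The sketch below is one way to do that. The helper analyze_urls(), the one-second default delay, and the fake_fetch() stand-in are assumptions for illustration; in a real run you would pass get_pagespeed_insights as the fetch argument.

```python
import time


def analyze_urls(urls, fetch, strategy="desktop", delay_seconds=1.0):
    """Fetch results for each URL, pausing between calls.

    `fetch` is any callable with the signature fetch(url, strategy=...).
    URLs whose fetch returns None are collected separately so that one
    failure does not stop the run.
    """
    results, failed = {}, []
    for index, url in enumerate(urls):
        if index > 0:
            time.sleep(delay_seconds)  # spacing between requests; the value is a guess
        data = fetch(url, strategy=strategy)
        if data is None:
            failed.append(url)
        else:
            results[url] = data
    return results, failed


# Stand-in for the real API call so the example runs offline.
def fake_fetch(url, strategy="desktop"):
    return None if "broken" in url else {"url": url, "strategy": strategy}


results, failed = analyze_urls(
    ["https://www.example1.com", "https://broken.example"],
    fake_fetch,
    delay_seconds=0,
)
print(sorted(results), failed)
```

Passing the fetch function in as an argument keeps the loop testable without network access.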

Step 5: Storing Data in a CSV File

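You might want to save the results to a CSV file for easy reference. Here’s how you can extend the script to do that. Note that main() below expects parse_pagespeed_data() to return the six metrics as a tuple (score, FCP, Speed Index, LCP, TBT, CLS) rather than only printing them: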

import csv

def save_to_csv(metrics, filename='pagespeed_results.csv'):
    with open(filename, mode='a', newline='') as file:
        writer = csv.writer(file)
        writer.writerow(metrics)

def main():
    # URL to be analyzed
    url = 'https://www.example.com'

    # Fetch data from PageSpeed Insights API
    data = get_pagespeed_insights(url, strategy='mobile')

    if data:
        # Parse performance metrics
        performance_score, fcp, si, lcp, tbt, cls = parse_pagespeed_data(data)

        # Save the results to a CSV file
        save_to_csv([url, performance_score, fcp, si, lcp, tbt, cls])

if __name__ == "__main__":
    main()
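
Because save_to_csv() opens the file in append mode, the resulting file has no header row. If you want one, a variant along these lines writes the header only when the file is new or empty. The column names in FIELDNAMES and the helper append_row() are choices made for this sketch.

```python
import csv
import os

# Column names are illustrative; they follow the order used in main() above.
FIELDNAMES = ["url", "performance_score", "fcp", "si", "lcp", "tbt", "cls"]


def append_row(row, filename="pagespeed_results.csv"):
    """Append one result row, writing the header first if the file is new."""
    is_new = not os.path.exists(filename) or os.path.getsize(filename) == 0
    with open(filename, mode="a", newline="", encoding="utf-8") as file:
        writer = csv.DictWriter(file, fieldnames=FIELDNAMES)
        if is_new:
            writer.writeheader()
        writer.writerow(row)
```

A call could then look like append_row(dict(zip(FIELDNAMES, [url, performance_score, fcp, si, lcp, tbt, cls]))).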

Conclusion

By using the PageSpeed Insights API with Python, you can automate performance monitoring for your own or your clients’ websites. This allows you to quickly generate reports, identify bottlenecks, and optimize user experience across devices. With the provided code, you can fetch data, process it, and store it for future analysis, which makes routine SEO performance checks faster and easier to repeat.

With Google’s emphasis on performance metrics for ranking, querying the PageSpeed Insights API programmatically lets you track site speed systematically and act on regressions early, which can support search engine visibility.

Happy coding!


Daniel Dye

