Step 1: Installing google-alerts for Python

First, let’s install the google-alerts library for Python and initialize our Google Alerts session. To begin, run the following command in your terminal: 
pip install google-alerts

Once installed, you’ll need to configure your Google Alerts session by entering your email and password. This can be done with the following setup command:

google-alerts setup --email <your-email-address> --password '<your-password>'

However, because the library hasn't been updated since 2020, it only works with ChromeDriver and Google Chrome version 84. You'll need to install both ChromeDriver v84 and Google Chrome v84, taking care not to overwrite your existing Chrome installation in the process. After installation, seed the Google Alerts session using:

google-alerts seed --driver /tmp/chromedriver --timeout 60

Step 2: Creating Your First Alert

Once the session is seeded, you can start creating alerts using Python, either in a script or Jupyter notebook. First, authenticate using the following code:

from google_alerts import GoogleAlerts

ga = GoogleAlerts('<your_email_address>', '<your_password>')
ga.authenticate()

Now, to create your first Google Alert for a specific search term, such as “Barcelona” in Spain, use this:

ga.create("Barcelona", {'delivery': 'RSS', 'language': 'es', 'monitor_match': 'ALL', 'region': 'ES'})

If successful, the function will return an object containing details such as the search term, language, region, match type, and the RSS link for that alert.
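The exact shape of that return value isn't formally documented, so the field names below are assumptions based on the details the library reports back (illustrative values only):

```python
# Illustrative only: key names and values are assumptions, not a documented
# schema -- inspect the actual return value of ga.create() in your session.
created_alert = {
    'term': 'Barcelona',          # the monitored search term
    'language': 'es',
    'region': 'ES',
    'monitor_match': 'ALL',
    'delivery': 'RSS',
    'rss_link': 'https://www.google.com/alerts/feeds/<user-id>/<alert-id>',
}

# The RSS link is the piece you'll need for Step 3
rss_feed = created_alert['rss_link']
```

Whatever the exact keys turn out to be, save the RSS link somewhere: it is the only thing the parsing step needs.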

Note: Unfortunately, it’s not currently possible to create an alert that monitors a term across all countries. If you leave the language and region fields blank, it defaults to English and the USA.

To review active alerts, use:

ga.list()

If you need to delete an alert, reference its monitor_id and use:

ga.delete("monitor_id")

Step 3: Parsing the RSS Feed

With the alert set up, let’s move on to parsing the RSS feed. We will extract key information such as the alert ID, title, publication date, URL, and content. Using the requests library and BeautifulSoup, we can extract and structure the data:

import requests
from bs4 import BeautifulSoup as Soup

r = requests.get('<your RSS feed>')
soup = Soup(r.text, 'xml')  # the 'xml' parser requires lxml to be installed

# Skip the first element where it belongs to the feed itself
# rather than to an individual alert entry
id_alert = [x.text for x in soup.find_all("id")[1:]]
title_alert = [x.text for x in soup.find_all("title")[1:]]
published_alert = [x.text for x in soup.find_all("published")]
update_alert = [x.text for x in soup.find_all("updated")[1:]]
# Each link is a Google redirect; the target URL sits between "url=" and "&ct="
link_alert = [[x["href"].split("url=")[1].split("&ct=")[0]] for x in soup.find_all("link")[1:]]
content_alert = [x.text for x in soup.find_all("content")]

compiled_list = [[id_alert[x], title_alert[x], published_alert[x], update_alert[x], link_alert[x], content_alert[x]] for x in range(len(id_alert))]
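The manual string-splitting on "url=" and "&ct=" works, but it breaks if Google reorders the query parameters. A more robust way to pull the target URL out of a redirect link is the standard library's urllib.parse (shown here on a sample redirect URL in the same shape as the feed's link values):

```python
from urllib.parse import urlparse, parse_qs

def extract_target_url(redirect_url):
    """Return the 'url' query parameter from a Google Alerts redirect link."""
    query = parse_qs(urlparse(redirect_url).query)
    # parse_qs maps each parameter to a list of values; take the first
    return query.get("url", [None])[0]

# Sample redirect link (made up for illustration)
sample = ("https://www.google.com/url?rct=j&sa=t"
          "&url=https://example.com/article-about-barcelona"
          "&ct=ga&cd=abc123")
print(extract_target_url(sample))
# → https://example.com/article-about-barcelona
```

Because it looks the parameter up by name, this keeps working no matter where "url=" appears in the query string.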

This code will generate a comprehensive list with all relevant metrics for each alert. If desired, you can save the results to an Excel file using Pandas:

import pandas as pd

df = pd.DataFrame(compiled_list, columns=["ID", "Title", "Published on", "Updated on", "Link", "Content"])
df.to_excel('new_alerts.xlsx', header=True, index=False)

Step 4: Automating Outreach to Sites

Leveraging Google Alerts with Python becomes particularly useful when automating outreach to websites that mention your brand or specific keywords. Using Python, you can scrape each URL and attempt to locate contact information such as email addresses.

Here’s a sample script that finds email addresses within the page content and links to contact pages:

import re
import requests
from bs4 import BeautifulSoup as Soup

for iteration in link_alert:
    request_link = requests.get(iteration[0])
    soup = Soup(request_link.text, 'html.parser')

    # Extract every email-like string from the page body, filtering out
    # image filenames (e.g. logo@2x.png) that happen to match the pattern
    body = soup.find("body").text if soup.find("body") else ""
    match = [x for x in re.findall(r'[\w.+-]+@[\w-]+\.[\w.-]+', body) if ".png" not in x]

    # Collect links whose anchor text suggests a contact page
    contact_urls = []
    for y in soup.find_all("a", href=True):
        if "contact" in y.text.lower():
            contact_urls.append(y["href"])

    iteration.append(match)
    iteration.append(contact_urls)
Once you've compiled the list of email addresses, you can even automate sending thank-you emails using smtplib (note that Gmail requires an app password for SMTP logins rather than your regular account password):

from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText
import smtplib

url = '<URL of the article that mentioned you>'

msg = MIMEMultipart()
password = '<your app password>'
msg['From'] = "<your email address>"
msg['To'] = "<receiver email address>"
msg['Subject'] = "Thank you for mentioning my brand!"

message = ("<p>Dear Sir or Madam,<br><br>Thank you for mentioning my brand in your "
           "article: " + url + ". I would appreciate it if you could include a link to my "
           "website at https://www.example.com.<br><br>Thanks in advance!</p>")
msg.attach(MIMEText(message, 'html'))

server = smtplib.SMTP('smtp.gmail.com', 587)
server.starttls()  # upgrade the connection to TLS before authenticating
server.login(msg['From'], password)
server.sendmail(msg['From'], msg['To'], msg.as_string())
server.quit()
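Before pointing this at real recipients, it's worth dry-running the message construction without opening an SMTP connection at all, so you can inspect exactly what would be sent. A minimal sketch (the function name and addresses are placeholders for illustration):

```python
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText

def build_thank_you(sender, recipient, article_url):
    """Compose (but do not send) the outreach email as a MIME message."""
    msg = MIMEMultipart()
    msg['From'] = sender
    msg['To'] = recipient
    msg['Subject'] = "Thank you for mentioning my brand!"
    body = ("<p>Dear Sir or Madam,<br><br>Thank you for mentioning my brand "
            "in your article: " + article_url + ".<br><br>Thanks in advance!</p>")
    msg.attach(MIMEText(body, 'html'))
    return msg

msg = build_thank_you("me@example.com", "editor@example.com",
                      "https://example.com/article")
print(msg['Subject'])
# → Thank you for mentioning my brand!
print(msg.get_payload()[0].get_content_type())
# → text/html
```

Once the output looks right, swap the print statements for the smtplib calls shown above.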

By combining these steps, you can automate both the monitoring of online mentions and the outreach process, streamlining your brand management and awareness efforts.


Daniel Dye

Daniel Dye is the President of NativeRank Inc., a premier digital marketing agency that has grown into a powerhouse of innovation under his leadership. With a career spanning decades in the digital marketing industry, Daniel has been instrumental in shaping the success of NativeRank and its impressive lineup of sub-brands, including MarineListings.com, LocalSEO.com, MarineManager.com, PowerSportsManager.com, NikoAI.com, and SearchEngineGuidelines.com. Before becoming President of NativeRank, Daniel served as the Executive Vice President at both NativeRank and LocalSEO for over 12 years. In these roles, he was responsible for maximizing operational performance and achieving the financial goals that set the foundation for the company’s sustained growth. His leadership has been pivotal in establishing NativeRank as a leader in the competitive digital marketing landscape. Daniel’s extensive experience includes his tenure as Vice President at GetAds, LLC, where he led digital marketing initiatives that delivered unprecedented performance. Earlier in his career, he co-founded Media Breakaway, LLC, demonstrating his entrepreneurial spirit and deep understanding of the digital marketing world. In addition to his executive experience, Daniel has a strong technical background. He began his career as a TAC 2 Noc Engineer at Qwest (now CenturyLink) and as a Human Interface Designer at 9MSN, where he honed his skills in user interface design and network operations. Daniel’s educational credentials are equally impressive. He holds an Executive MBA from the Quantic School of Business and Technology and has completed advanced studies in Architecture and Systems Engineering from MIT. His commitment to continuous learning is evident in his numerous certifications in Data Science, Machine Learning, and Digital Marketing from prestigious institutions like Columbia University, edX, and Microsoft. 
With a blend of executive leadership, technical expertise, and a relentless drive for innovation, Daniel Dye continues to propel NativeRank Inc. and its sub-brands to new heights, making a lasting impact in the digital marketing industry.
