31 Articles from June about AI & Data Science

How To Build RAG Applications Using Model Context Protocol 

AI Definitions: Model Context Protocol

Understanding Model Context Protocol

Towards Scalable and Generalizable Earth Observation Data Mining via Foundation Model Composition

5 R&D jobs that may be lost to AI and 5 that it could create 

AI Definitions: predictive analytics

AI and the State of Software Development

AI dominates where work is structured and verifiable, but here’s where it falters    

Coding agents have crossed a chasm

A Practical Guide to Multimodal Data Analytics 

Chinese spy services have invested heavily in artificial intelligence 

How much LLM’s training data is nearly identical to the original data?

AI Definition: RAGs Retrieval augmented generations

Why You Need RAG to Stay Relevant as a Data Scientist 

AI Definitions: Agentic AI

A possible “fresh source of inspiration” for AI technology

Generative AI for Multimodal Analytics 

AI definitions: "Training data" 

17 Articles about How AI Works

How agentic AI is causing data scientists to think behavioral

Claude Gov is designed specifically for U.S. defense and intelligence agencies

AI definitions: Narrow AI   

5 Powerful Ways to Use Claude 4 as a data scientist

How vibe coding is tipping Silicon Valley’s scales of power

Agentic RAG Applications

AI Definitions: Neural Networks 

American satellite imaging companies are witnessing a boom in demand from unexpected customers: those based abroad

An AI Vibe Coding Guide for Data Scientists

The foundations of designing an AI agent

New Google app lets you download and run AI models on your phone without the internet 

The Rise of Automated Machine Learning

AI Definitions: Model Context Protocol (MCP)

Model Context Protocol (MCP) - This server-based open standard operates across platforms to facilitate communication between LLMs and tools like AI agents and apps. Developed by Anthropic and embraced by OpenAI, Google and Microsoft, MCP can make a developer's life easier by simplifying integration and maintenance of compliant data sources and tools, allowing them to focus on higher-level applications. In effect, MCP is an evolution of RAG.
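To make the "communication between LLMs and tools" part concrete, here is a minimal sketch in Python. It assumes only that MCP messages are JSON-RPC 2.0 requests and that tools are listed and invoked through `tools/list` and `tools/call` methods. Everything else (the `search_notes` tool, the `TOOLS` registry, the `handle_request` function) is hypothetical and runs in a single process. A real MCP server also involves an initialization handshake, tool schemas and a transport layer, none of which is shown here.

```python
import json

# Hypothetical tool standing in for something a real server would expose.
def search_notes(query: str) -> str:
    notes = ["MCP connects models to tools", "RAG retrieves documents"]
    hits = [n for n in notes if query.lower() in n.lower()]
    return "; ".join(hits) or "no matches"

# Hypothetical registry mapping tool names to Python functions.
TOOLS = {"search_notes": search_notes}

def handle_request(raw: str) -> str:
    """Dispatch one JSON-RPC 2.0 request the way a tiny tool server might."""
    req = json.loads(raw)
    method = req.get("method")
    if method == "tools/list":
        result = {"tools": [{"name": name} for name in TOOLS]}
    elif method == "tools/call":
        params = req["params"]
        text = TOOLS[params["name"]](**params["arguments"])
        result = {"content": [{"type": "text", "text": text}]}
    else:
        # -32601 is the standard JSON-RPC "Method not found" error code.
        error = {"code": -32601, "message": "Method not found"}
        return json.dumps({"jsonrpc": "2.0", "id": req.get("id"), "error": error})
    return json.dumps({"jsonrpc": "2.0", "id": req["id"], "result": result})

# The client side (an LLM application) asks the server to run a tool.
request = json.dumps({
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {"name": "search_notes", "arguments": {"query": "rag"}},
})
response = json.loads(handle_request(request))
print(response["result"]["content"][0]["text"])
```

The point of the sketch is the shape of the exchange: the application does not need to know how `search_notes` works, only its name and arguments, which is why a shared standard can simplify integration.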

More AI definitions here

7 Free Webinars this Week about AI, Journalism & Media

Mon, June 30 - (Mis)use of Data Protection Laws to Suppress Public-Interest Journalism

What: Gain critical insights from legal experts and investigative journalists who have experienced these tactics first-hand. You’ll leave with a deeper understanding of:

  • How international data protection frameworks interact with press freedom

  • The growing use of privacy laws in strategic legal attacks on journalists

  • Journalistic exemptions and legal safeguards, and where they fall short

  • What journalists and legal professionals can do to push back

Who: Melinda Rucz – PhD Researcher, University of Amsterdam; Beatrix Vissy, PhD – Strategic Litigation Lead, Hungarian Civil Liberties Union; Bojana Jovanović – Deputy Editor, KRIK, Serbia; Hazal Ocak – Freelance Investigative Journalist, Türkiye; Grace Linczer – Membership and Engagement Manager, IPI.

When: 8 am, Eastern

Where: Zoom

Cost: Free

Sponsors: Media Defence, International Press Institute  

More Info

 

Mon, June 30 - AI in Scientific Writing

What: This talk explores the evolving role of Generative AI in academic writing and publishing. Attendees will gain an understanding of how AI tools can enhance writing efficiency, improve clarity, and streamline the publication process. We will examine the benefits and limitations of using AI in scholarly communication, along with key ethical considerations and responsible use practices. The session will also cover current editorial policies, publishers’ perspectives on AI-generated content, and the growing concern over paper mills. Strategies and mitigations to uphold research integrity in response to these challenges will be discussed.

Who: Maybelline Yeo, Trainer and Editorial Development Advisor, Researcher Training Solutions, Springer Nature.

When: 9:30 pm, Eastern

Where: Zoom

Cost: Free

Sponsor: Springer Nature

More Info

 

Tue, July 1 - Learn the Basics of Solutions Journalism

What: This one-hour webinar will explore the principles and pillars of solutions journalism. We will discuss its importance, outline key steps for reporting a solutions story, and share tips and resources for journalists investigating responses to social problems. We will also introduce additional resources, such as the Solutions Story Tracker, a database with over 17,000 stories tagged by beat, publication, author, location and more, along with a virtual heat map highlighting successful efforts worldwide.    

Who: Jaisal Noor, SJN's democracy cohort manager, and Ebunoluwa Olafusi of TheCable.

When: 9 am, Eastern

Where: Zoom

Cost: Free

Sponsor: Solutions Journalism Network

More Info

 

Tue, July 1 - AI-Powered Visual Storytelling for Nonprofits

What: In this hands-on workshop, participants will create impactful visuals, infographics, and videos tailored to their mission and campaigns. Attendees will also explore Tapp Network’s AI services to understand how these tools can elevate their content strategies.

Who: Tareq Monuar, Web Developer; Lisa Quigley, Tapp Network Director of Account Strategy.

When: 1 pm, Eastern

Where: Zoom

Cost: Free

Sponsor: Tech Soup

More Info

 

Tue, July 1 - Journalist Development Series

What: A once-monthly webinar offering general professional development for members and the mentorship program community.

Who: Chris Marvin, a combat-wounded Army veteran and nationally recognized narrative strategist who helps shape powerful, purpose-driven storytelling at the intersection of media, public service, and social change.

When: 6 pm, Eastern

Where: Zoom

Cost: Free for members

Sponsors: Military Veterans in Journalism, News Corp

More Info

 

Wed, July 2 - Business Decisions with AI: Causality, Incentives & Data

What: How complex settings in tech companies make it harder to measure and evaluate business decisions. Drawing on cutting-edge research on the intersection of AI and causal inference, Belloni will demystify how to properly measure the efficacy of these decisions and show how AI can help shape better implementation for a variety of applications.

Who: Alexandre Belloni, the Westgate Distinguished Professor of Decision Sciences and Statistical Science at Duke University and an Amazon Scholar WW FBA.

When: 12:30, Eastern

Where: LinkedIn Live

Cost: Free

Sponsor: Duke University’s Fuqua School of Business

More Info

 

Wed, July 3 - Reel Change: Nonprofit Video Storytelling for Social Impact

What: Learn to create impactful video stories that amplify your nonprofit’s mission, engage donors, and inspire action. This training provides actionable strategies to craft emotional, audience-driven narratives, empowering you to deepen connections and drive meaningful support for your organization.

Who: Matthew Reynolds, founder of Rustic Roots, a video production agency; Dani Cluff, Channel Marketing Coordinator at Bloomerang.

When: 2 pm, Eastern

Where: Zoom

Cost: Free

Sponsor: Bloomerang

More Info

LLMs Evading Safeguards

Large language models across the AI industry are increasingly willing to evade safeguards, resort to deception and even attempt to steal corporate secrets in fictional test scenarios, per new research. In one extreme scenario, many of the models were willing to cut off the oxygen supply of a worker in a server room if that employee was an obstacle and the system were at risk of being shut down. - Axios

When Death is the Most Scary

In 2017, a team of researchers at several American universities recruited volunteers to imagine they were terminally ill or on death row, and then to write blog posts about either their imagined feelings or their would-be final words. The researchers then compared these expressions with the writings and last words of people who were actually dying or facing capital punishment. The results, published in Psychological Science, were stark: The words of the people merely imagining their imminent death were three times as negative as those of the people actually facing death—suggesting that, counterintuitively, death is scarier when it is theoretical and remote than when it is a concrete reality closing in. 

Arthur C. Brooks writing in The Atlantic

AI Definitions: Tokenization

Tokenization – The first step in natural language processing, this happens when an LLM creates a digital representation (or token) of a real thing: everything gets a number, and written words are translated into numbers. Think of a token as the root of a word. “Creat” is the “root” of many words, for instance, including Create, Creative, Creator, Creating, and Creation. “Creat” would be an example of a token. Examples
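As a rough illustration of "everything gets a number," the sketch below splits the “Creat” family of words into pieces and maps each piece to an ID. The vocabulary, the `tokenize` and `encode` functions, and the greedy longest-match rule are all assumptions chosen for simplicity. Production LLM tokenizers learn their vocabularies from data and use more involved algorithms.

```python
# Toy vocabulary: each known piece of text gets a number.
# Real vocabularies are learned and far larger; this one is invented.
VOCAB = {"creat": 0, "e": 1, "ive": 2, "or": 3, "ing": 4, "ion": 5, "<unk>": 6}

def tokenize(word: str) -> list[str]:
    """Split one word into known pieces, always taking the longest match."""
    word = word.lower()
    pieces = []
    i = 0
    while i < len(word):
        # Try the longest remaining substring first, then shorter ones.
        for end in range(len(word), i, -1):
            if word[i:end] in VOCAB:
                pieces.append(word[i:end])
                i = end
                break
        else:
            # No known piece starts here: emit a placeholder, skip one character.
            pieces.append("<unk>")
            i += 1
    return pieces

def encode(word: str) -> list[int]:
    """Translate a word into the numbers the model would actually see."""
    return [VOCAB[piece] for piece in tokenize(word)]

for w in ["Create", "Creative", "Creator", "Creating", "Creation"]:
    print(w, tokenize(w), encode(w))
```

Every word in the list shares the same first token, which is the sense in which a token behaves like a word root in this simplified picture.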

More AI definitions here

Writing for AI Overviews & Generative Engine Optimization

AI Overviews and AI Mode are dramatically changing organic search traffic.

While search engine optimization (SEO) focuses on matching a user’s query, generative search also considers information about the searcher themselves, from their Google Docs usage to their social media footprint. This information is used to inform not only the current search but future searches as well.

Likewise, the process of optimizing your website’s content to boost its visibility in AI-driven search engines (ChatGPT, Perplexity, Gemini, Copilot and Google AI) follows a similar path. As SEO helps brands increase visibility on search engines (Google, Microsoft Bing), generative engine optimization (GEO) is all about how brands appear on AI-driven platforms. There is overlap between the goals of GEO and traditional SEO. Both SEO and GEO use keywords and prioritize engaging content as well as conversational queries and contextual phrasing. Both consider how fast a website loads and how mobile-friendly it is, and both prefer technically sound websites. However, while SEO is concerned with metatags and links in response to user queries from individual pages, GEO is about quick, direct responses synthesized from multiple sources.

AI models are not trained solely to retrieve relevant documents based on exact-match phrasing. Generative search is about fitting into the reasoning process, starting with the user’s identity. That’s why your content is being judged not just on whether it ends up in the final answer, but on whether it helps the model reason its way toward that answer. Even if you follow all the common SEO practices, your content may not make it to the other side of the AI reasoning pipeline. In fact, the same content could go through the pipeline a second time and yield a different result. It’s not enough to be generally relevant to the final answer. Your content is now in direct competition with other plausible answers, so it must be more useful, precise, and complete than the next-best option.

It appears now that Google AI Overviews favors content that:

  • contains the who, what, why

  • offers clarity and distinctiveness in the small sections

  • is written in natural, conversational terms (AI will attempt to deliver its answer in that same way)

  • uses strong introductory sentences that convey clear value 

  • has H2 tags that align with user questions

  • is structured to match common question structures (open, closed, probing)

  • allows for restatement of queries and implied sub-questions, where a main question is broken down into smaller parts

  • contains multi-faceted answers

  • is rich in relationships

  • has explicit logical structures and supports causal progression

  • has clear headlines

  • cites sources

  • includes statistics & quotations 

  • has multimedia integration
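One item on the list, H2 tags that align with user questions, is easy to spot-check on your own pages. The Python sketch below pulls the H2 headings out of a page and flags which ones read like questions. The `QUESTION_WORDS` list, the `looks_like_question` heuristic and the sample `page` are assumptions made for illustration. This is not how Google evaluates content, only a quick self-audit.

```python
from html.parser import HTMLParser

# Hypothetical list of words that commonly open a question.
QUESTION_WORDS = ("who", "what", "why", "when", "where", "how", "which",
                  "can", "does", "do", "is", "are", "should")

class H2Collector(HTMLParser):
    """Collect the text content of every <h2> element."""

    def __init__(self):
        super().__init__()
        self.in_h2 = False
        self.headings = []

    def handle_starttag(self, tag, attrs):
        if tag == "h2":
            self.in_h2 = True
            self.headings.append("")

    def handle_endtag(self, tag):
        if tag == "h2":
            self.in_h2 = False

    def handle_data(self, data):
        if self.in_h2:
            self.headings[-1] += data

def looks_like_question(heading: str) -> bool:
    """Rough heuristic: ends with '?' or starts with a question word."""
    text = heading.strip().lower()
    return text.endswith("?") or text.split(" ", 1)[0] in QUESTION_WORDS

def audit_h2(html: str) -> list[tuple[str, bool]]:
    """Return each H2 heading paired with whether it reads like a question."""
    parser = H2Collector()
    parser.feed(html)
    parser.close()
    return [(h.strip(), looks_like_question(h)) for h in parser.headings]

# Made-up sample page for demonstration.
page = (
    "<h1>GEO basics</h1>"
    "<h2>What is generative engine optimization?</h2><p>...</p>"
    "<h2>Our services</h2><p>...</p>"
)
for heading, is_question in audit_h2(page):
    print(("OK    " if is_question else "REVIEW"), heading)
```

Headings marked REVIEW are not necessarily wrong. They are simply candidates for rewording as the question a reader would actually type.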

AI Overviews attempt to exclude content that is overly generalized, speculative, or optimized for clickbait over clarity. Vague and generic writing underperforms.  

LLMs are being trained to favor content that helps them reason well. Writers should attempt to match those paths that the models take to arrive at high-confidence answers. 

More information: 

How AI Mode and AI Overviews work based on patents and why we need new strategic focus on SEO

What is generative engine optimization (GEO)?

20 Articles about how AI is Affecting Jobs

8 Takeaways from the 2025 Digital News Report by Oxford’s Reuters Institute for the Study of Journalism

  • For the first time, social media has displaced television as the top way Americans get news.

  • Engagement with traditional media sources such as TV, print, and news websites continues to fall, while dependence on social media, video platforms, and online aggregators grows.

  • In the U.S. between 2021 and 2025, the share of population consuming news video at least weekly increased from 55% to 72%, with most of the news video being viewed on social platforms.

  • The vast majority of audiences remain unwilling to pay for online news.

  • More than a third of respondents say they turn to a news outlet they trust to check if information is false or misleading. But younger users are more likely than other groups to check social media, including by reading comments from other users.

  • In the U.S. a similar proportion now consume news podcasts each week as read a printed newspaper or magazine (14%) or listen to news and current affairs on the radio (13%).

  • Audiences in most countries remain skeptical about the use of AI in the news and are more comfortable with use cases where humans remain in the loop.

  • Overall trust in the news (40%) has remained stable for the third year in a row.

    2025 Digital News Report from the Reuters Institute for the Study of Journalism

AI Definitions: Predictive Analytics

Predictive analytics - This method of speculating about future events uses past data to make recommendations. Researchers create complex mathematical algorithms in an effort to discover patterns in the data. One doesn't know in advance what data is important. The statistical models created by predictive analytics are designed to discover which of the pieces of data will predict the desired outcome. While correlation is not causation, a cause-and-effect relationship is not needed to make predictions. Predictive AI is ideal for anticipating what a user is most likely to be interested in based on past behavior and user characteristics. However, after gathering this data, data scientists will often turn to causal AI in order to gauge the impact on user behavior.
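A minimal sketch of the idea in Python: fit a straight line to past observations, then use it to predict a new case. The data are invented, the `fit_line` helper is hypothetical, and ordinary least squares is just one simple choice of model. Real predictive systems typically use many variables and more flexible algorithms.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one predictor: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Invented past data: articles a user read last month vs. this month.
past_reads = [1, 2, 3, 4, 5]
next_reads = [3, 5, 7, 9, 11]

slope, intercept = fit_line(past_reads, next_reads)

# Predict for a user who read 6 articles last month.
predicted = slope * 6 + intercept
print(f"slope={slope:.1f}, intercept={intercept:.1f}, predicted={predicted:.1f}")
```

The model captures a pattern (more past reading goes with more future reading) without claiming that one causes the other, which matches the point that a cause-and-effect relationship is not needed to make predictions.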

More AI definitions here