5 min read · Updated April 28, 2026

Microsoft Copilot Vision

By Microsoft

Microsoft Copilot Vision is a screen-aware AI capability that lets Copilot see and understand what's on your screen in real time, providing contextual help, explanations, and suggested next steps based on what you're looking at in your browser or Windows environment.

Learning Objectives

  • Understand what Microsoft Copilot Vision is and how screen awareness differs from standard AI chat
  • Identify the key contexts where Copilot Vision is available: Edge browser and Windows
  • Evaluate how Copilot Vision fits into Microsoft's broader Copilot ecosystem

What Is Microsoft Copilot Vision?

Microsoft Copilot Vision is a capability within Microsoft Copilot that allows the AI to see and understand your current screen content in real time. Instead of requiring you to describe what you're looking at or paste text into a chat, Copilot Vision observes your active browser tab or Windows screen and provides contextual assistance based on what's visible: explanations, summaries, and suggestions tied to your current context.

Copilot Vision is part of Microsoft's strategy to make Copilot the AI layer embedded everywhere in Windows and Microsoft products — not a standalone tool you switch to, but an assistant that's always aware of what you're doing and ready to help without requiring you to re-explain your context.

Tip

Try Copilot Vision: Available in Microsoft Edge browser (Copilot sidebar → Vision tab) and in Windows 11 Copilot. Requires a Microsoft account; some features require Copilot Pro ($20/month). Access Edge at microsoft.com/edge.

Key Features

Browser Screen Reading (Edge)

In Microsoft Edge, Copilot Vision can read the current webpage:

  • Page summaries: "Summarize this article for me" without copying any text
  • Contextual Q&A: "What does [term on this page] mean?" or "What is the key argument being made here?"
  • Shopping assistance: On product pages, Copilot can read the product details and compare with similar items or highlight important specifications
  • Research assistance: Reading a long document and asking targeted questions about its content
  • Accessibility: Reading and explaining complex content for users who need assistance understanding dense material

Windows Screen Awareness

On Windows 11, Copilot can observe the active window:

  • Application assistance: "How do I do [task] in this application?" with Copilot seeing what application is open
  • Document assistance: Reading the document you have open and answering questions about its content
  • Code assistance: Looking at code you have open and explaining what it does or identifying issues
  • Settings guidance: "Where do I find [setting]?" with Copilot seeing what Windows screen you're on

💡Key Concept

Screen-aware vs. computer-use agents: Copilot Vision is primarily observational and advisory — it sees your screen and helps you understand and navigate what's there, but does not take autonomous actions on your behalf. ChatGPT Operator and Claude Computer Use are action-taking agents that click, fill forms, and complete tasks. Vision is the "smart assistant looking over your shoulder" model; Operator/Computer Use is the "agent doing things for you" model.

Copilot in Microsoft 365 (Document Context)

Within Microsoft 365 apps (Word, Excel, PowerPoint, Outlook), Copilot has native document awareness:

  • Word: Copilot can read the entire document and answer questions, suggest edits, or summarize sections
  • Excel: Understands the data in your spreadsheet — can answer questions, create formulas, and generate charts
  • PowerPoint: Reviews your presentation and suggests improvements, adds slides, or generates speaker notes
  • Outlook: Reads your email thread and drafts replies, summarizes long threads, or extracts action items

This represents the deepest integration of Vision-style contextual awareness — Copilot has full access to document contents because it's operating within the Microsoft 365 ecosystem.

Microsoft Copilot Ecosystem Context

Copilot Vision is one component of Microsoft's layered Copilot strategy:

Layer | Product | Vision Capability
Web browser | Edge Copilot | Reads current webpage; contextual Q&A
Operating system | Windows 11 Copilot | Sees active application; Windows-wide assistance
Productivity suite | Microsoft 365 Copilot | Full document/spreadsheet/email context
Enterprise data | Microsoft 365 Copilot (enterprise) | Secure access to org's emails, documents, Teams

Pricing

Free (Microsoft account): $0
  • Basic Copilot in Edge
  • Limited Vision queries
Copilot Pro: $20/month
  • Full Vision in Edge
  • Microsoft 365 Copilot integration
  • Priority access to GPT-5.5
Microsoft 365 Copilot (enterprise): $30/user/month
  • Full M365 integration
  • Organizational data access
  • Teams Copilot
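
To compare plans on a yearly basis, the monthly prices above can be multiplied out. The following is a minimal Python sketch of that arithmetic; the function name `annual_cost` and the 25-seat team are illustrative assumptions, and the prices are simply the ones quoted in this lesson, which may not match current billing.

```python
# Monthly list prices as quoted in this lesson (USD); real pricing may differ.
COPILOT_PRO_MONTHLY = 20    # per individual subscriber
M365_COPILOT_MONTHLY = 30   # per user


def annual_cost(monthly_price: int, users: int = 1) -> int:
    """Return the yearly cost in USD for `users` seats at `monthly_price`."""
    return monthly_price * 12 * users


# One Copilot Pro subscriber for a year.
print(annual_cost(COPILOT_PRO_MONTHLY))              # 240

# A hypothetical 25-person team on Microsoft 365 Copilot for a year.
print(annual_cost(M365_COPILOT_MONTHLY, users=25))   # 9000
```

The per-user multiplier is what makes the enterprise plan scale differently from Copilot Pro: the cost grows linearly with team size.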

Strengths

  • Zero-friction context sharing: No copy-paste or file uploads needed; Copilot sees what you see
  • Deep Microsoft 365 integration: Native document understanding in Word, Excel, PowerPoint, Outlook
  • Privacy within the Microsoft ecosystem: Enterprise plans keep data within Microsoft's compliance boundaries
  • Always available in Edge: Edge browser users have Copilot Vision available without switching tools
  • Windows-wide presence: Available across the OS, not just in one application
  • GPT-5.5 backend: Copilot Pro and M365 Copilot use OpenAI's frontier model

Limitations & Considerations

  • Observational, not agentic: Copilot Vision advises and explains but does not autonomously complete tasks on your screen
  • Edge and Windows only: Full Vision capabilities require Microsoft's browser and OS; not available in Chrome or on Mac/Linux
  • Quality varies by context: Works best in structured Microsoft 365 documents; less reliable on complex or non-standard web layouts
  • Pro subscription for full features: Free tier has limited Vision access; full capability requires Copilot Pro ($20/month)
  • Enterprise subscription required for M365: Full organizational data awareness requires the $30/user/month Microsoft 365 Copilot plan

Best Use Cases

Task | Why Copilot Vision
Research and article reading | Summarize, explain, and Q&A on any webpage without copy-paste
Microsoft 365 document work | Native Word/Excel/PowerPoint understanding; in-app assistance
Windows application help | Contextual "how do I do this?" assistance based on active app
Email management | Outlook thread summarization and reply drafting with full context
Shopping research | Product page analysis and comparison without manual data entry

When to choose alternatives:

  • Autonomous task completion → ChatGPT Operator
  • Full desktop computer control → Claude Computer Use
  • Real-time web information → Perplexity Assistant
  • AI in non-Microsoft browsers → Claude for Chrome or browser-specific extensions

Getting Started

  1. Open Microsoft Edge — download at microsoft.com/edge
  2. Click the Copilot icon in the Edge toolbar (top right) to open the sidebar
  3. Navigate to any webpage and ask Copilot "Summarize this page" — Copilot Vision reads the current tab automatically
  4. Try in a Microsoft 365 document: open Word, then open Copilot and ask "What are the main points of this document?"
  5. For full capabilities, consider Copilot Pro ($20/month) for priority access and Microsoft 365 integration

Tip

Most useful for Microsoft 365 users: If your workflow centers on Word, Excel, PowerPoint, and Outlook, Copilot Vision's native document context is genuinely valuable — asking "What formula would calculate the percentage change across column B?" while looking at your actual spreadsheet is faster and more accurate than describing the spreadsheet in a separate chat window. The Edge browser integration is a solid bonus; the real value proposition is inside the Microsoft 365 suite.
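
For reference, the percentage-change question in the tip above comes down to simple arithmetic. The sketch below shows one reasonable reading of it in Python: the change from the first value in the column to the last, as a percentage of the first. The function name `percent_change` and the sample data are made up for illustration; this is not the formula Copilot is guaranteed to produce.

```python
def percent_change(values: list[float]) -> float:
    """Percentage change from the first value in the column to the last."""
    first, last = values[0], values[-1]
    if first == 0:
        # Dividing by a zero starting value is undefined.
        raise ValueError("percentage change is undefined when the first value is 0")
    return (last - first) / first * 100


# Made-up sample standing in for "column B" of a spreadsheet.
column_b = [200.0, 220.0, 250.0]
print(percent_change(column_b))  # 25.0
```

If the question were instead about row-to-row change, the same expression would be applied to each adjacent pair of values rather than to the first and last.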

Key Takeaways

  • Microsoft Copilot Vision lets Copilot see your current screen — webpage, Windows application, or Microsoft 365 document — and provide contextual assistance without requiring you to describe or copy your content
  • Available in Microsoft Edge (webpage reading), Windows 11 (active app awareness), and deeply integrated into Microsoft 365 apps (Word, Excel, PowerPoint, Outlook)
  • Primarily observational and advisory — it helps you understand and work with what's on screen, rather than taking autonomous actions like ChatGPT Operator
  • Most powerful for users already in the Microsoft ecosystem: Windows 11 + Edge + Microsoft 365 integration unlocks the full vision of screen-aware AI assistance
  • Copilot Pro ($20/month) and Microsoft 365 Copilot ($30/user/month) unlock the full capability; basic screen reading in Edge is available free
