TMW10 Harvesting the Web: Using PowerShell to Scrape Screens, Exploit Web Services, and Save Time


4:00pm - 5:15pm

Level: Introductory

Mark Minasi

IT Consultant, Author, Speaker


For many IT pros, the day starts with the same activity: surfing to catch up. We skim our Twitter feeds, view and perhaps download The Picture of the Day from some amazing photo site, check the weather, peek at eBay to see if that prized item is selling under $100 yet, or whatever. That's great, but what isn't so great is that most of us are gathering those data the same way we've been doing it for twenty years: by clicking around in a web browser. Hey, it's the 21st century, and all the cool kids are automating things. So why aren't you using PowerShell's tools to accomplish your daily web data-gathering tasks, whether personal- or business-related?

PowerShell includes a number of little-known but essential web-harvesting cmdlets, including Invoke-WebRequest, New-WebServiceProxy, Invoke-RestMethod, and Select-Xml, to name just a few. Before they can help you automate your web data gathering and interpretation, though, you'll need some background in screen scraping, SOAP versus RESTful web services, XPath queries, forms, managing JSON data, and the like. That could take a pretty long time, unless you attend this session, created and delivered by best-selling tech author and speaker Mark Minasi. This fast-paced, fun session interweaves brief, "just enough information and no more" explanations of the underlying technologies with illustrative, real-world examples. Attend this session and you too can be The Master of Web Automation.

You will learn:

  • About PowerShell cmdlets that let you harvest web data sources, even if you know very little about PowerShell
  • How "screen scraping" with PowerShell's Invoke-WebRequest and regular expressions enables you to grab the one vital bit of info