Thanks, Neal, I'll check it out. And hey, I have no issue with easy if it helps solve a problem :)
Thanks again,
Elaine
Original Message:
Sent: 10/9/2023 6:48:00 PM
From: Neal Fultz
Subject: RE: Primitive R -search tool, Consulting section community
Hi Elaine, I did a talk on web scraping a couple of years ago - http://bit.ly/ssc_scraping - but it wasn't recorded, so there are only slides now. It was for social scientists, so it may be too basic for you.
It does reference a couple packages / techniques that may be helpful.
You'll just have to match your scraper to the data source. If it's a clean data source, it might import directly into Google Sheets automatically or take only one or two lines of code. On the other hand, if it's complicated, you'll basically have to reverse-engineer the website.
Just don't do anything that would get you sued and remember https://xkcd.com/1205/ and you should be good.
Best,
Neal
Original Message:
Sent: 10-09-2023 18:32
From: Elaine Eisenbeisz
Subject: Primitive R -search tool, Consulting section community
I like the idea of a Slack/Discord space. Like a virtual water cooler we can hang out at and talk shop.
Neal, I'm working on a project now where I'm trying to scrape a project file and make reports for the leadership team. Do you have programs or helpful literature to share on something like that? Thanks in advance.
Best,
Elaine
------------------------------
Elaine Eisenbeisz
Owner and Principal Statistician
Omega Statistics
Original Message:
Sent: 10-09-2023 14:01
From: Neal Fultz
Subject: Primitive R -search tool, Consulting section community
Re scraping - I scraped the entire CNSL list during the pandemic when Michiko was leading the getting-started-guide group, but I haven't updated it in a couple of years. I thought the word cloud turned out nicely.
I have a mild preference for Slack over Discord, just because Discord is so aggressive at collecting PII.
-Neal
Original Message:
Sent: 10-09-2023 13:39
From: Adam Batten
Subject: Primitive R -search tool, Consulting section community
Thanks for sharing, Chris! I've been toying with the idea of setting up a Slack/Discord channel for this community. Is anyone interested in that?
Cheers,
AB
------------------------------
Adam Batten
Lead Statistician & President AB Analytics LLC
AB EVERGREEN ANALYTICS LLC
Original Message:
Sent: 10-08-2023 18:05
From: Chris Barker
Subject: Primitive R -search tool, Consulting section community
Currently there are very limited tools for searching old posts in the Consulting or any other ASA section community. The easiest option is simply to scroll back through many posts, which is primitive and not a very practical search procedure.
My thanks to a section member and volunteer, Andrew Pua, who used R to create a report in Excel (I attached both the Excel .xlsx and CSV formats). Credit also goes to a previous section chair for the idea; my apologies for not recalling his name. That previous chair suggested "wouldn't it be nice if..." someone could extract posts from the community and organize them in a spreadsheet to simplify searching old posts. For example, "wouldn't it be nice" if there were an easy way to assemble all previous posts about finding insurance for a consulting practice.
The attached is a "web scraping" of the first 700 community postings, prepared in R. The output is in the two formats mentioned above. The file lists each topic title and a link to the post. Have the community section open in order to test the links. The comments and discussion threads related to each topic are not included.
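For anyone curious what a title-and-link scrape like this involves, here is a minimal sketch. It is in Python rather than R and is not Andrew's script. It assumes, purely for illustration, that each topic on a listing page is an anchor tag with a hypothetical class "topic-title"; the real community pages will use different markup, and the base URL and sample HTML below are made up. It parses a page that has already been downloaded and does no network access.

```python
import csv
import io
from html.parser import HTMLParser
from urllib.parse import urljoin


class TopicLinkParser(HTMLParser):
    """Collect (title, url) pairs from <a class="topic-title"> elements.

    The "topic-title" class is a hypothetical marker, not the real site's markup.
    """

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.rows = []
        self._href = None   # href of the topic link currently being read
        self._text = []     # text fragments seen inside that link

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        if tag == "a" and "topic-title" in classes:
            self._href = attrs.get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            # Collapse internal whitespace and resolve relative links.
            title = " ".join("".join(self._text).split())
            self.rows.append((title, urljoin(self.base_url, self._href)))
            self._href = None


def extract_topics(html, base_url):
    """Return a list of (title, absolute_url) tuples found in the HTML."""
    parser = TopicLinkParser(base_url)
    parser.feed(html)
    return parser.rows


def topics_to_csv(rows):
    """Render the rows as CSV text with a title,link header."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["title", "link"])
    writer.writerows(rows)
    return buffer.getvalue()


# Made-up sample page; only the "topic-title" anchors should be kept.
SAMPLE_HTML = """
<ul>
  <li><a class="topic-title" href="/discussion/insurance-for-consultants">Insurance for a consulting practice</a></li>
  <li><a class="nav" href="/login">Log in</a></li>
  <li><a class="topic-title" href="/discussion/search-tool">Primitive R search tool</a></li>
</ul>
"""

if __name__ == "__main__":
    rows = extract_topics(SAMPLE_HTML, "https://community.example.org")
    print(topics_to_csv(rows), end="")
```

Like the attached files, this keeps only the topic title and the link; the comments and discussion threads would need a second pass over each linked page.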
There may be easy ways to improve this search tool. I'll make Andrew's R script available on the section website for anyone who would like to do the web scraping on their own. I also encourage anyone who can improve the tool to do so and make it available to section members who wish to search old posts; that would be greatly appreciated as well.
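One easy improvement might be a keyword filter over the CSV, for example to pull every topic about insurance. The following is a minimal Python sketch, not part of Andrew's R script. The column names "title" and "link" and the sample rows are assumptions for illustration; the attached file may label its columns differently.

```python
import csv
import io

# Made-up sample data; the real CSV's column names and links may differ.
SAMPLE_CSV = """title,link
Insurance for a consulting practice,https://community.example.org/discussion/1
Setting hourly rates,https://community.example.org/discussion/2
Professional liability INSURANCE options,https://community.example.org/discussion/3
"""


def search_topics(csv_text, keyword):
    """Return the rows whose title contains the keyword, ignoring case."""
    needle = keyword.casefold()
    reader = csv.DictReader(io.StringIO(csv_text))
    return [row for row in reader if needle in row["title"].casefold()]


if __name__ == "__main__":
    for row in search_topics(SAMPLE_CSV, "insurance"):
        print(row["title"], "->", row["link"])
```

To run this against a real file, read its contents with open(path, newline="", encoding="utf-8") and pass the text to search_topics.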
-Chris
------------------------------
Chris Barker, Ph.D.
2023 Chair Statistical Consulting Section
Consultant and
Adjunct Associate Professor of Biostatistics
www.barkerstats.com
---
"In composition you have all the time you want to decide what to say in 15 seconds, in improvisation you have 15 seconds."
-Steve Lacy
------------------------------