Primitive R -search tool, Consulting section community

  • 1.  Primitive R -search tool, Consulting section community

    Posted 10-08-2023 18:05

    Currently there are very limited tools to search for old posts in the Consulting or any other ASA section community. The easiest approach is simply to scroll back through many posts, which is primitive and not a very practical search procedure.

    My thanks to a section member and volunteer, Andrew Pua. He used R to create a report in Excel (I attached both the Excel .xlsx and CSV formats). And credit to a previous section chair for the idea; my apologies for not recalling his name. That previous chair suggested "...wouldn't it be nice if..." someone could extract posts from the community and organize them in a spreadsheet to simplify searching old posts. For example, "wouldn't it be nice" if there were an easy way to assemble all previous posts about finding insurance for a consulting practice.

    The attached is a "web scraping" of the first 700 community postings, prepared in R. The output is in the two formats mentioned above. The file lists the topic title and a link to the post. Have the community section open in order to test the links. The comments or discussion thread related to each topic are not included.
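
    As an illustration only (this is not Andrew's R script, which is not shown in this thread), a keyword search over such a topic list could be sketched in Python as follows. The column names `title` and `link`, the sample rows, and the example.org URLs are all assumptions; check the header row of the attached CSV for the real names.

```python
import csv
import io

# Hypothetical sample standing in for first-700-topics.csv.
# The real file's column names may differ: "title" and "link" are assumptions.
SAMPLE_CSV = """title,link
Finding insurance for a consulting practice,https://example.org/post/1
Setting hourly rates,https://example.org/post/2
Professional liability INSURANCE options,https://example.org/post/3
"""


def search_topics(csv_file, keyword, title_col="title"):
    """Return the rows whose title contains keyword, ignoring case."""
    needle = keyword.lower()
    reader = csv.DictReader(csv_file)
    return [row for row in reader if needle in row[title_col].lower()]


# Search the in-memory sample; a real file object works the same way.
matches = search_topics(io.StringIO(SAMPLE_CSV), "insurance")
for row in matches:
    print(row["title"], "->", row["link"])
```

    To try this on the attachment, pass `open("first-700-topics.csv", newline="", encoding="utf-8")` in place of the in-memory sample and set `title_col` to whatever the title column is actually called.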

    There may be easy ways to improve this search tool. I'll make Andrew's R script available on the section website for anyone who would like to do the web scraping on their own. And I encourage anyone who can improve the tool to make it available to section members who wish to search old posts; that would also be greatly appreciated.

    -Chris



    ------------------------------
    Chris Barker, Ph.D.
    2023 Chair Statistical Consulting Section
    Consultant and
    Adjunct Associate Professor of Biostatistics
    www.barkerstats.com


    ---
    "In composition you have all the time you want to decide what to say in 15 seconds, in improvisation you have 15 seconds."
    -Steve Lacy
    ------------------------------

    Attachment(s)

    csv
    first-700-topics.csv   153 KB 1 version
    xlsx
    first-700-topicsXLSX.xlsx   65 KB 1 version


  • 2.  RE: Primitive R -search tool, Consulting section community

    Posted 10-09-2023 13:40

    Thanks for sharing, Chris! I've been toying with the idea of setting up a Slack/Discord channel for this community. Anyone interested in that?

    Cheers,

    AB



    ------------------------------
    Adam Batten
    Lead Statistician & President AB Analytics LLC
    AB EVERGREEN ANALYTICS LLC
    ------------------------------



  • 3.  RE: Primitive R -search tool, Consulting section community

    Posted 10-09-2023 14:02

    Re scraping - I scraped the entire CNSL list during the pandemic, when Michiko was leading the getting-started-guide group, but I haven't updated it in a couple of years. I thought the word cloud turned out nicely:

    I have a mild preference for Slack over Discord, just because Discord is so aggressive at collecting PII.

    -Neal




  • 4.  RE: Primitive R -search tool, Consulting section community

    Posted 10-09-2023 14:41

    Neal, thank you. I'll post the R code in the section repository.

    Is there a way to set up some kind of R GUI for accessing/searching past posts? Or is that something that would require paying for a computer account?



    ------------------------------
    Chris Barker, Ph.D.
    2023 Chair Statistical Consulting Section
    Consultant and
    Adjunct Associate Professor of Biostatistics
    www.barkerstats.com
    ------------------------------



  • 5.  RE: Primitive R -search tool, Consulting section community

    Posted 10-09-2023 18:32

    I like the idea of a Slack/Discord space. Like a virtual water cooler we can hang out at and talk shop.

    Neal, I'm working on a project now where I'm trying to scrape a project file and make reports for the leadership team. Do you have programs or helpful literature to share on something like that? Thanks in advance.

    Best,

    Elaine



    ------------------------------
    Elaine Eisenbeisz
    Owner and Principal Statistician
    Omega Statistics
    ------------------------------



  • 6.  RE: Primitive R -search tool, Consulting section community

    Posted 10-09-2023 18:48

    Hi Elaine, I did a talk on web scraping a couple of years ago - http://bit.ly/ssc_scraping - but it wasn't recorded, so there are only slides now. It was for social scientists, so it may be too basic for you.

    It does reference a couple packages / techniques that may be helpful. 

    You'll just have to match your scraper to the data source. If it's a clean data source, it might import directly into Google Sheets automatically or need only one or two lines of code. On the other hand, if it's complicated, you'll basically have to reverse-engineer the website.
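
    For the simple end of that range, a minimal Python sketch (not taken from the slides) of pulling topic titles and links out of a page using only the standard library is below. The HTML snippet and its structure are made up for illustration; a real page needs its own inspection, and fetching it is left out here.

```python
from html.parser import HTMLParser

# Hypothetical page fragment; real community pages will be structured differently.
SAMPLE_HTML = """
<ul>
  <li><a href="/post/1">Finding insurance</a></li>
  <li><a href="/post/2">Setting <b>hourly</b> rates</a></li>
</ul>
"""


class LinkCollector(HTMLParser):
    """Collect (link text, href) pairs for every <a href=...> element."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None   # href of the <a> currently open, if any
        self._text = []     # text fragments seen inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        # Only keep text that appears inside an open link.
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append(("".join(self._text).strip(), self._href))
            self._href = None


collector = LinkCollector()
collector.feed(SAMPLE_HTML)
for title, href in collector.links:
    print(title, "->", href)
```

    The resulting pairs can then be written out with the `csv` module to get a title-and-link spreadsheet like the one attached at the top of this thread.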

    Just don't do anything that would get you sued and remember https://xkcd.com/1205/ and you should be good.

    Best,

    Neal




  • 7.  RE: Primitive R -search tool, Consulting section community

    Posted 10-10-2023 08:35
    Thanks, Neal, I'll check it out. And hey, I have no issue with easy if it helps solve a problem :)
    Thanks again,
    Elaine 






  • 8.  RE: Primitive R -search tool, Consulting section community

    Posted 10-09-2023 14:13
    I'd vote for Slack.

    Jane






  • 9.  RE: Primitive R -search tool, Consulting section community

    Posted 10-09-2023 14:26
    Slack is my preference. 
    Thanks for bringing this up!
    Best,
    Glen






  • 10.  RE: Primitive R -search tool, Consulting section community

    Posted 10-09-2023 14:36

    Thx. I have been testing a Slack account with the current section officers, and I have used Slack in previous consulting projects.

    I don't know if or how useful Slack would be for the section members, especially if there are redundancies with the community webpage.

    Should there be interest in testing Slack, it should be possible to request access to my Slack account at chris.barker@barkerstats.com. I created a Statistical Consulting Section group and a subgroup for the section officers.



    ------------------------------
    Chris Barker, Ph.D.
    2023 Chair Statistical Consulting Section
    Consultant and
    Adjunct Associate Professor of Biostatistics
    www.barkerstats.com
    ------------------------------



  • 11.  RE: Primitive R -search tool, Consulting section community

    Posted 10-09-2023 14:40
    Thanks, Chris!
    By "group" do you mean there's a channel for officers and a channel for the whole section?
    Using separate Slack channels seems like a good way to manage this.
    Given the posting frequency, I suspect that most of our needs could be met this way.
    Best,
    Glen






  • 12.  RE: Primitive R -search tool, Consulting section community

    Posted 10-09-2023 14:49

    I'm not particularly knowledgeable about Slack terminology. I attached screenshot(s) of whatever Slack calls it. :) I think the Slack term is workspace. And it appears I can change the URL to something easy to type.



    ------------------------------
    Chris Barker, Ph.D.
    2023 Chair Statistical Consulting Section
    Consultant and
    Adjunct Associate Professor of Biostatistics
    www.barkerstats.com
    ------------------------------



  • 13.  RE: Primitive R -search tool, Consulting section community

    Posted 10-09-2023 14:54
    Great! Yep, so if you just send out the URL for the Slack workspace, then people can request to join and you can let us in.
    I'd be happy to help with some "light" Slack etiquette instructions for anyone who's new to Slack (e.g., a Slack etiquette read-me that we can pin to the main channel).






  • 14.  RE: Primitive R -search tool, Consulting section community

    Posted 10-09-2023 15:09

    I created a Slack channel (?):

    https://asa-stat-cnsl-section.slack.com

    Screenshots are attached.



    ------------------------------
    Chris Barker, Ph.D.
    2023 Chair Statistical Consulting Section
    Consultant and
    Adjunct Associate Professor of Biostatistics
    www.barkerstats.com
    ------------------------------



  • 15.  RE: Primitive R -search tool, Consulting section community

    Posted 10-09-2023 15:11

    And a correction to my note: I created the channel statistical-consulting-section-members.



    ------------------------------
    Chris Barker, Ph.D.
    2023 Chair Statistical Consulting Section
    Consultant and
    Adjunct Associate Professor of Biostatistics
    www.barkerstats.com
    ------------------------------



  • 16.  RE: Primitive R -search tool, Consulting section community

    Posted 10-09-2023 15:44
    Hey Chris,
    Thanks for this! It looks like we need to be individually invited to the workspace, i.e., the Slack admin puts in the invitee's email address. I guess it's not like Google Drive, where you can request access by knowing the URL.

    I've sent you my email via your consulting page. Could you please invite that email address?
    I'm not sure if there's a less manual way to give members access.

    Thanks again,
    Glen






  • 17.  RE: Primitive R -search tool, Consulting section community

    Posted 10-10-2023 19:29

    Hi Glen, there may be a glitch on my webpage; unfortunately, I didn't receive your email. You're welcome to email me directly at chris.barker@barkerstats.com. I agree that, so far, each individual needs to be added one at a time. There may be a way to include multiple email addresses in the "add" option.

    There is a paid module ($), "Slack Connect", which seems to simplify the process of adding people to the Slack channel. However, I didn't find the price of Slack Connect on the website, only instructions for calling the Slack sales team. I'm skeptical that this is a modest licensing fee.



    ------------------------------
    Chris Barker, Ph.D.
    2023 Chair Statistical Consulting Section
    Consultant and
    Adjunct Associate Professor of Biostatistics
    www.barkerstats.com
    ------------------------------