<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Data Science | Matthew Brousil</title><link>https://brousil.science/tag/data-science/</link><atom:link href="https://brousil.science/tag/data-science/index.xml" rel="self" type="application/rss+xml"/><description>Data Science</description><generator>Wowchemy (https://wowchemy.com)</generator><language>en-us</language><lastBuildDate>Thu, 30 Mar 2023 00:00:00 +0000</lastBuildDate><image><url>https://brousil.science/media/icon_hu3f944acd3d6616c2d13f4cde577e2e67_445931_512x512_fill_lanczos_center_2.png</url><title>Data Science</title><link>https://brousil.science/tag/data-science/</link></image><item><title>targets for ecologists</title><link>https://brousil.science/project/targets/</link><pubDate>Thu, 30 Mar 2023 00:00:00 +0000</pubDate><guid>https://brousil.science/project/targets/</guid><description>&lt;p>&lt;em>Targets for ecologists&lt;/em> is a &lt;code>{bookdown}&lt;/code> resource showing how to use the &lt;code>{targets}&lt;/code> workflow management package to build research pipelines with R. It was adapted from two three-hour workshops on the &lt;code>{drake}&lt;/code> package, the predecessor to the current &lt;code>{targets}&lt;/code> package, run in 2020 and 2021 through Washington State University’s CEREO. The original workshop materials were roughly formatted for workshops styled after the Carpentries. The goal is to present these materials in a &lt;code>{bookdown}&lt;/code> format so that ecologists can learn the basics of creating workflows with &lt;code>{targets}&lt;/code> through a hands-on approach. This is a living document that may receive edits in the future.&lt;/p>
&lt;p>&lt;strong>Related links:&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>&lt;a href="https://targets-ecology.netlify.app/" target="_blank" rel="noopener">{bookdown} website&lt;/a>&lt;/li>
&lt;li>&lt;a href="https://osf.io/gd8hf/" target="_blank" rel="noopener">OSF page&lt;/a>&lt;/li>
&lt;/ul></description></item><item><title>Global lake area, climate, and population dataset</title><link>https://brousil.science/project/glcp/</link><pubDate>Thu, 11 Jun 2020 00:00:00 +0000</pubDate><guid>https://brousil.science/project/glcp/</guid><description>&lt;p>The global lake area, climate, and population dataset (GLCP) is a dataset containing lake surface area for 1.42+ million lakes and reservoirs from 1995 to 2015 with basin-level temperature, precipitation, and population data.&lt;/p>
&lt;p>I joined the team working on this dataset in 2018 to support quality control, data management, and streamlining the workflow for the project. While we put the dataset together, I performed three main tasks:&lt;/p>
&lt;ol>
&lt;li>I reviewed the R scripts used in the workflow for accuracy and formatting; applied a consistent format and style to the scripts and their in-line documentation; and assisted with some parts of the R workflow such as data visualization, data subsetting, and mapping.&lt;/li>
&lt;li>I advised on how to improve the file structure and documentation (e.g. READMEs) for the project to ensure reproducibility.&lt;/li>
&lt;li>I helped build the QA &amp;amp; QC processes for the dataset.&lt;/li>
&lt;/ol>
&lt;p>The scripts for this dataset are publicly available at the Environmental Data Initiative, &lt;a href="https://portal.edirepository.org/nis/mapbrowse?packageid=edi.394.4" target="_blank" rel="noopener">here&lt;/a> as &lt;code>glcp_scripts.tar.gz&lt;/code>.&lt;/p></description></item><item><title>Dungeons and Dragons Simulation with Shiny</title><link>https://brousil.science/project/5e_sim/</link><pubDate>Thu, 02 Jul 2020 00:00:00 +0000</pubDate><guid>https://brousil.science/project/5e_sim/</guid><description>&lt;p>During grad school I was playing in a &lt;a href="https://en.wikipedia.org/wiki/Pathfinder_Roleplaying_Game" target="_blank" rel="noopener">Pathfinder&lt;/a> game with some friends and found myself wondering whether a new item my character had received was something I should use or &lt;em>sell&lt;/em>. The new sword&amp;rsquo;s stats were similar enough to the one my character was currently using that it wasn&amp;rsquo;t clear to me which one was better. I was still fairly new to R at the time, so I used this as an opportunity to test out &lt;code>for&lt;/code> loops, &lt;code>if&lt;/code> statements, and functions in more depth.&lt;/p>
&lt;p>I ended up writing a script that accepted the inputs a typical Pathfinder or Dungeons and Dragons (D&amp;amp;D) player would already have at hand: statistics about their character, the opposing character (or monster!), the number of dice they roll when they attack, and so on.&lt;/p>
&lt;p>In the short term I found out that the new weapon was better, but this example has served as an ongoing learning experience for me outside of gameplay. I eventually updated the script when I started playing with the new rules of Dungeons and Dragons 5th edition (5e), and then again when I wanted to dig into building Shiny apps with R.&lt;/p>
&lt;p>I now have two versions of this as a Shiny app on shinyapps.io:&lt;/p>
&lt;ul>
&lt;li>My standard version built for a character of the barbarian class in D&amp;amp;D 5th Edition. Test it &lt;a href="http://cactusoxbird.shinyapps.io/dd-shiny-sim" target="_blank" rel="noopener">here&lt;/a>.&lt;/li>
&lt;li>A rewrite of the standard version using the &lt;code>purrr&lt;/code> package to change how simulation runs are handled. This is slower than the original version of the app, but I found it to be a useful case for understanding how the &lt;code>map()&lt;/code> family can be put to use for simulation. Test it &lt;a href="https://cactusoxbird.shinyapps.io/dd-shiny-sim-purrr/" target="_blank" rel="noopener">here&lt;/a>.&lt;/li>
&lt;/ul>
&lt;p>Both versions of the app provide several pieces of output for the inquisitive D&amp;amp;D player:&lt;/p>
&lt;ul>
&lt;li>A histogram of the total damage done by the character for each of the simulation runs&lt;/li>
&lt;li>Lines on the histogram and printed text showing the mean and median damage done over the course of the simulated runs&lt;/li>
&lt;li>A tab containing a table of hit statistics: every amount of damage their character did over the simulation runs, the raw number of times each amount was done, and the percentage of the time that amount was done&lt;/li>
&lt;/ul>
&lt;p>These outputs were helpful to me as a Pathfinder/D&amp;amp;D player, and as I reviewed them I learned more than just that my character&amp;rsquo;s new sword was an improvement. For example, what I had not realized while throwing dice down on the table was &lt;em>just how frequently&lt;/em> a character misses. With the current base settings, the D&amp;amp;D 5e barbarian simulation shows that a character should expect to miss ~30% of the time they take a swing. This might be dismaying, but it could also be a bit of a confidence boost for anyone who feels like they often have bad luck.&lt;/p>
&lt;p>You can find the scripts for this app on GitHub &lt;a href="https://github.com/mbrousil/5e-shiny-sim" target="_blank" rel="noopener">here&lt;/a>.&lt;/p></description></item></channel></rss>