<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Scientific Workflows | Matthew Brousil</title><link>https://brousil.science/tag/scientific-workflows/</link><atom:link href="https://brousil.science/tag/scientific-workflows/index.xml" rel="self" type="application/rss+xml"/><description>Scientific Workflows</description><generator>Wowchemy (https://wowchemy.com)</generator><language>en-us</language><lastBuildDate>Thu, 30 Mar 2023 00:00:00 +0000</lastBuildDate><image><url>https://brousil.science/media/icon_hu3f944acd3d6616c2d13f4cde577e2e67_445931_512x512_fill_lanczos_center_2.png</url><title>Scientific Workflows</title><link>https://brousil.science/tag/scientific-workflows/</link></image><item><title>targets for ecologists</title><link>https://brousil.science/project/targets/</link><pubDate>Thu, 30 Mar 2023 00:00:00 +0000</pubDate><guid>https://brousil.science/project/targets/</guid><description>&lt;p>&lt;em>Targets for ecologists&lt;/em> is a &lt;code>{bookdown}&lt;/code> resource showing how to use the &lt;code>{targets}&lt;/code> workflow management package to build research pipelines with R. It was adapted from two three-hour workshops on the &lt;code>{drake}&lt;/code> package run in 2020 and 2021 through Washington State University’s CEREO. &lt;code>{drake}&lt;/code> is the predecessor to the current &lt;code>{targets}&lt;/code> package. The original materials were loosely formatted in the style of Carpentries workshops. The goal is to present these materials in a &lt;code>{bookdown}&lt;/code> format so that ecologists can learn the basics of creating workflows with &lt;code>{targets}&lt;/code> through a hands-on approach. This is a living document that may receive edits in the future.&lt;/p>
&lt;p>&lt;strong>Related links:&lt;/strong>&lt;/p>
&lt;ul>
&lt;li>&lt;a href="https://targets-ecology.netlify.app/" target="_blank" rel="noopener">{bookdown} website&lt;/a>&lt;/li>
&lt;li>&lt;a href="https://osf.io/gd8hf/" target="_blank" rel="noopener">OSF page&lt;/a>&lt;/li>
&lt;/ul></description></item><item><title>Global lake area, climate, and population dataset</title><link>https://brousil.science/project/glcp/</link><pubDate>Thu, 11 Jun 2020 00:00:00 +0000</pubDate><guid>https://brousil.science/project/glcp/</guid><description>&lt;p>The global lake area, climate, and population dataset (GLCP) is a dataset containing lake surface area for 1.42+ million lakes and reservoirs from 1995 to 2015 with basin-level temperature, precipitation, and population data.&lt;/p>
&lt;p>I joined the team working on this dataset in 2018 to support quality control, data management, and workflow streamlining for the project. While we assembled the dataset, I performed three main tasks:&lt;/p>
&lt;ol>
&lt;li>I reviewed the R scripts used in the workflow for accuracy and formatting; applied a consistent format and style to the scripts and their in-line documentation; and assisted with some parts of the R workflow such as data visualization, data subsetting, and mapping.&lt;/li>
&lt;li>I advised on how to improve the project&amp;rsquo;s file structure and documentation (e.g., READMEs) to ensure reproducibility.&lt;/li>
&lt;li>I helped build the QA &amp;amp; QC processes for the dataset.&lt;/li>
&lt;/ol>
&lt;p>The scripts for this dataset are publicly available at the Environmental Data Initiative, &lt;a href="https://portal.edirepository.org/nis/mapbrowse?packageid=edi.394.4" target="_blank" rel="noopener">here&lt;/a> as &lt;code>glcp_scripts.tar.gz&lt;/code>.&lt;/p></description></item><item><title>Workflow management with drake</title><link>https://brousil.science/project/drake/</link><pubDate>Fri, 20 Nov 2020 00:00:00 +0000</pubDate><guid>https://brousil.science/project/drake/</guid><description>&lt;p>In fall 2020 the &lt;a href="https://cereo.wsu.edu/" target="_blank" rel="noopener">Center for Environmental Research, Education and Outreach&lt;/a> at Washington State University hosted &lt;a href="https://mbrousil.github.io/workshops/2020-workshop-1" target="_blank" rel="noopener">a workshop&lt;/a> covering reproducible research techniques in R for graduate students. We devoted one day of the workshop to &lt;code>drake&lt;/code> workflows to show students the R-specific options for managing workflows. We expected that workflow management and, to some extent, R functions would be unfamiliar topics to many students, so the workshop day included discussion of why one would use workflow management software and some basic examples of building realistic functions. When we ran the workshop I wasn&amp;rsquo;t aware of &lt;a href="https://github.com/ropensci/targets" target="_blank" rel="noopener">&lt;code>targets&lt;/code>&lt;/a>, so in the future we may repurpose this example and replace &lt;code>drake&lt;/code> with that package.&lt;/p>
&lt;p>I put together an R Markdown document &lt;a href="https://brousil.science/uploads/drake_wkshp.html" target="_blank">here&lt;/a> that combines the contents of my intro presentation with the lesson material. The (mostly) empty folder structure I provided for the walkthrough is &lt;a href="https://github.com/mbrousil/drake_template" target="_blank" rel="noopener">here&lt;/a> and an example of the finished product is in a repo &lt;a href="https://github.com/mbrousil/example_drake_project" target="_blank" rel="noopener">here&lt;/a>. We used Fanaee-T and Gama&amp;rsquo;s (2013) dataset on bike sharing in DC. More information about the dataset is &lt;a href="https://archive.ics.uci.edu/ml/datasets/Bike&amp;#43;Sharing&amp;#43;Dataset" target="_blank" rel="noopener">here&lt;/a>.&lt;/p>
&lt;p>Link to &lt;a href="https://cereo.wsu.edu/reproducible-r-workshop-2021s/" target="_blank" rel="noopener">workshop website&lt;/a>.&lt;/p></description></item></channel></rss>