Chapter 8 Hosting
Once you have developed your Plumber API, the next step is to find a way to host it. If you haven’t dealt with hosting an application on a server before, you may be tempted to call run() from an interactive session on your development machine (either your personal desktop or an RStudio Server instance) and direct traffic there. This is a dangerous idea for a number of reasons:
- Your development machine likely has a dynamic IP address. This means that clients may be able to reach you at that address today, but the address is likely to change in the coming weeks or months, at which point their requests will stop reaching you.
- Networks may leverage firewalls to block incoming traffic to certain networks and machines. Again, it may appear that everything is working for you locally, but other users elsewhere in the network or external clients may not be able to connect to your development machine.
- If your Plumber process crashes (for instance, due to your server running out of memory), this method of running Plumber will not automatically restart the crashed service for you. This means that your API will be offline until you manually log in and restart it. Likewise, if your development machine gets rebooted, your API will not automatically be started when the machine comes back online.
- This technique relies on having your clients specify a port number manually. Non-technical users may be tripped up by this; some of the other techniques do not require clients to specify the port for an API.
- This approach will only ever run a single R process for your API. Some of the other approaches will allow you to load-balance traffic between multiple R processes to handle more requests. RStudio Connect will even dynamically scale the number of running processes for you so that your API isn’t consuming more system resources than is necessary.
- Most importantly, serving public requests from your development environment can be a security hazard. Ideally, you should separate your development instances from the servers that are accessible by others.
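For reference, the ad hoc approach criticized above usually amounts to something like the following sketch. The wrapper name run_api_locally and the file name plumber.R are hypothetical, chosen only for illustration; plumb() and the router's run() method are the standard Plumber entry points.

```r
# Sketch of the ad hoc approach described above (not recommended for hosting).
# "plumber.R" is a hypothetical API definition file in the working directory.
run_api_locally <- function(api_file = "plumber.R", port = 8000) {
  # Parse the annotated R file into a Plumber router.
  pr <- plumber::plumb(api_file)

  # host = "0.0.0.0" listens on all network interfaces so that other
  # machines can connect. This call blocks the R session until interrupted.
  pr$run(host = "0.0.0.0", port = port)
}
```

Calling run_api_locally() ties up the interactive session, serves requests from a single R process, and requires clients to include the port number in every URL, which are the limitations listed above.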
For these reasons and more, you should consider setting up a separate server on which you can host your Plumber APIs. There are a variety of options that you can consider.
8.1 DigitalOcean
DigitalOcean is an easy-to-use Cloud Computing provider. They offer a simple way to spin up a Linux virtual machine and access it remotely. You can choose what size machine you want to run – with options ranging from small machines with 512MB of RAM for a few dollars a month up to large machines with dozens of GB of RAM – and only pay for it while it’s online.
Plumber includes helper functions that enable you to automatically provision a Plumber server and deploy your APIs to it. To set up a Plumber server running on DigitalOcean, you’ll follow these steps:
- Create a DigitalOcean account.
- Set up an SSH key and deploy the public portion to DigitalOcean so you’ll be able to log in to your server.
- Install the analogsea R package and run a test command like analogsea::droplets() to confirm that it’s able to connect to your DigitalOcean account.
- Run mydrop <- plumber::do_provision(). This will start a virtual machine (or “droplet”, as DigitalOcean calls them) and install Plumber and all the necessary prerequisite software. Once the provisioning is complete, you should be able to access port 8000 on your server’s IP and see a response from Plumber.
- You can use plumber::do_deploy_api() to deploy or update your own custom APIs to a particular port on your server.
- (Optional) Set up a domain name for your Plumber server so you can use www.myplumberserver.com instead of the server’s IP address.
- (Optional) Configure SSL.
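The connect, provision, and deploy steps above can be sketched in R roughly as follows. This is a minimal sketch, not a definitive recipe: the wrapper name provision_and_deploy, the API name "myapi", and port 8001 are hypothetical, and the positional argument order passed to do_deploy_api() (droplet, API name, local directory, port) is an assumption that should be checked against ?plumber::do_deploy_api for your installed version.

```r
# Sketch of the connect / provision / deploy steps listed above.
# Requires a DigitalOcean account, an SSH key registered with DigitalOcean,
# and the analogsea and plumber packages.
provision_and_deploy <- function(local_api_dir, api_name = "myapi", port = 8001) {
  # Confirm that analogsea can reach your DigitalOcean account.
  print(analogsea::droplets())

  # Start a new droplet and install Plumber and its prerequisites.
  mydrop <- plumber::do_provision()

  # Upload the API found in local_api_dir and serve it on the chosen port.
  # ASSUMPTION: argument order is (droplet, name, local directory, port).
  plumber::do_deploy_api(mydrop, api_name, local_api_dir, port)

  # Return the droplet so it can be reused for later deployments.
  invisible(mydrop)
}
```

Port 8001 is an arbitrary choice that avoids port 8000, where the freshly provisioned server already responds. Keeping the returned droplet object around lets you call plumber::do_deploy_api() again later to update the API.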
Getting everything connected the first time can be a bit of work, but once you have analogsea connected to your DigitalOcean account, you’re now able to spin up new Plumber servers in DigitalOcean hosting your APIs with just a couple of R commands. You can even write scripts that provision an entire Plumber server with multiple APIs associated.
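A script of the kind just described might look like the sketch below, which provisions one server and deploys several APIs to it. The API names, directories, and ports are hypothetical placeholders, the helper provision_server_with_apis is not part of Plumber, and the argument order passed to do_deploy_api() is an assumption to verify against its help page.

```r
# Hypothetical description of the APIs to host: a name, the local directory
# holding each API's files, and the port it should be served on.
my_apis <- list(
  list(name = "sales",    dir = "./apis/sales",    port = 8001),
  list(name = "forecast", dir = "./apis/forecast", port = 8002)
)

# Provision one droplet, then deploy every API in the list to it.
provision_server_with_apis <- function(apis) {
  # Each API needs its own port on the server, so fail early on duplicates.
  ports <- vapply(apis, function(api) api$port, numeric(1))
  stopifnot(!anyDuplicated(ports))

  mydrop <- plumber::do_provision()
  for (api in apis) {
    # ASSUMPTION: argument order is (droplet, name, local directory, port).
    plumber::do_deploy_api(mydrop, api$name, api$dir, api$port)
  }
  invisible(mydrop)
}
```

Running provision_server_with_apis(my_apis) would create a billable droplet, so the call is left out of the sketch. Describing the APIs as data keeps the script easy to extend: hosting another API means adding one entry to the list.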
8.2 RStudio Connect
RStudio Connect is an enterprise publishing platform from RStudio. It supports push-button publishing from the RStudio IDE of a variety of R content types including Plumber APIs.
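Publishing can also be scripted rather than done from the IDE button. The following is a minimal sketch that assumes the rsconnect package is installed and that a Connect server and account have already been registered with it; the wrapper name publish_api and the directory ./myapi are hypothetical.

```r
# Sketch of scripted publishing to RStudio Connect.
# ASSUMPTIONS: rsconnect is installed and already configured with a Connect
# server and account; "./myapi" is a hypothetical directory that contains
# the API's plumber.R file.
publish_api <- function(api_dir = "./myapi") {
  # deployAPI() bundles the directory and publishes it as a Plumber API.
  rsconnect::deployAPI(api = api_dir)
}
```

A script like this can be convenient when deployments are triggered from an automated job rather than from an interactive RStudio session.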
RStudio Connect automatically manages the number of R processes necessary to handle the current load and balances incoming traffic across all available processes. It can also shut down idle processes when they’re not in use. This allows you to run the appropriate number of R processes to scale your capacity to accommodate the current load.
Conflict of interest: the primary author of plumber and this book works for RStudio on RStudio Connect.