Ruby is a very versatile language. It combines the simplicity of an elegant syntax with powerful features such as support (and encouragement) for monkey patching. Thanks to the popularity of the Rails framework, Ruby is one of the top ten languages in use today. With so many people writing Ruby code, there are plenty of common mistakes with serious security implications that developers can make.
Since its inception, Ruby has rightly built a reputation for being packed with features focused on developer happiness. A famous quote about Ruby, from the Rails doctrine, goes something like this.
Ruby includes a lot of sharp knives in its drawer of features. There’s nothing programmatically in Ruby to stop you using its sharp knives to cut ties with reason.
That’s a fairly accurate assessment! When you provide sharp knives, it’s only a matter of time before someone ends up seriously hurt.
Let’s take a look at some of the common ways Ruby, and Rails, developers can do a lot of damage by overlooking some important security considerations. We’ll also see how that damage can be prevented with some minor tweaks and adjustments to the code.
It’s always a bad idea to execute anything that the user might have submitted. Serializing and deserializing user input does not seem like such a bad idea because no code appears to be run. But code can end up being run! In fact, unsafe deserialization is one of the entries in the OWASP Top Ten, a basic checklist for web security.
Ruby’s built-in YAML library, based on Psych, has support for serializing custom data types to YAML and back. See this serialization code here and the YAML it produces.
# serialize.rb
require 'yaml'
require 'set'
s = Set.new([1, 2, 3])
File.open('set.yml', 'w') do |file|
  YAML.dump(s, file)
end
--- !ruby/object:Set
hash:
  1: true
  2: true
  3: true
Deserializing this YAML gives back the original data type.
# deserialize.rb
require 'yaml'
require 'set'
file = File.read('set.yml')
s = YAML.load(file)
p s
$ ruby deserialize.rb
#<Set: {1, 2, 3}>
As you can see, the line --- !ruby/object:Set in the YAML describes how to re-instantiate objects from their text representations. But this opens up a slew of attack vectors that can escalate to remote code execution (RCE) whenever instantiating an object can execute code.
Solution
The solution is to use safe-loading. It’s a very small change, just using the YAML::safe_load function instead of YAML::load, but it completely blocks loading of custom classes.
# deserialize.rb
require 'yaml'
require 'set'
file = File.read('set.yml')
s = YAML.safe_load(file)
p s
$ ruby deserialize.rb
/usr/ruby/2.7.0/psych/class_loader.rb:97:in `find':
Tried to load unspecified class: Set (Psych::DisallowedClass)
Standard types like hashes and arrays can still be serialized to and deserialized from YAML documents just like before, which is what most people wanted to do in the first place.
---
hash:
  1: true
  2: true
  3: true
$ ruby deserialize.rb
{"hash"=>{1=>true, 2=>true, 3=>true}}
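When an application genuinely needs a few extra types, safe loading can be given an explicit allowlist instead of being switched off. The following is a minimal sketch, assuming Ruby 2.6 or later (where YAML.safe_load accepts the permitted_classes keyword); the helper name load_untrusted_yaml is hypothetical and not part of any library.

```ruby
require 'yaml'
require 'date'

# Hypothetical helper: parse untrusted YAML, permitting only the classes
# that the caller lists explicitly. Returns nil when the input is rejected.
def load_untrusted_yaml(text, permitted: [])
  YAML.safe_load(text, permitted_classes: permitted)
rescue Psych::DisallowedClass => e
  warn "Rejected YAML: #{e.message}"
  nil
end

# Plain hashes, strings and integers need no allowlist.
plain = load_untrusted_yaml("---\nname: widget\ncount: 3\n")

# An unquoted date is parsed as a Date, so Date has to be permitted.
dated = load_untrusted_yaml("---\nreleased: 2020-11-15\n", permitted: [Date])

# A document that asks for an arbitrary Ruby object is still refused.
tagged = load_untrusted_yaml("--- !ruby/object:Set\nhash:\n  1: true\n")

p plain
p dated
p tagged
```

Keeping the allowlist as short as possible is the point of the design: each permitted class is one more type an attacker can ask the parser to build. On Ruby 3.1 and later, which ship Psych 4, YAML.load itself refuses arbitrary classes by default and the old behavior has moved to YAML.unsafe_load.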
If your app needs to read or write user-specified files from the disk or make API requests to user-specified URLs, you might be familiar with code that looks like this.
# test.rb
puts 'Enter file path:'
path = gets.chomp
puts 'File contents:'
open(path, 'r') do |file|
  until file.eof? do
    puts file.gets
  end
end
This looks fairly innocuous, right? Ask the user for a file name and print out its contents line-by-line till the end of the file. And it works as you would expect too! But if you don’t see the problem, you likely didn’t read the documentation thoroughly enough.
You see, the Kernel::open function that we used in the code snippet above has one additional feature: it can also be used to spawn processes and pipe output from them. See what happens when you pass an input that starts with the pipe (|) character.
$ ruby test.rb
Enter file path:
|date
File contents:
Sun Nov 15 22:03:33 IST 2020
This executes the date command and pipes the output from that command back to Ruby, which reads it just like it would read a file. Of course a malicious user would be executing commands much more destructive than date. This is basically RCE, a remote user executing code on your servers, with the full privileges available to your web-app.
At the risk of sounding repetitive: never, ever run code that comes from an untrusted source such as a user. If you absolutely have to act on such input, at the very least limit its access to resources that you have chosen.
Solution
Never use Kernel::open. The better alternative is to use File::open, URI::open (from open-uri) or IO::open, whichever fits the use case. Let’s try this again, this time with File::open instead of Kernel::open, and see how it works differently. It’s only slightly different but so much more secure. No more access to shell commands.
# test.rb
puts 'Enter file path:'
path = gets.chomp
puts 'File contents:'
File.open(path, 'r') do |file|
  until file.eof? do
    puts file.gets
  end
end
$ ruby test.rb
Enter file path:
|date
File contents:
Traceback (most recent call last):
2: from test.rb:4:in `<main>'
1: from test.rb:4:in `open'
test.rb:4:in `initialize': No such file or directory @ rb_sysopen - |date (Errno::ENOENT)
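File::open removes the shell access, but the user can still name any file the process is allowed to read. One way to limit access to resources that you have chosen is to resolve the requested path against a fixed base directory and reject anything that lands outside it. This is only a sketch under that assumption; read_within is a hypothetical helper, not a standard API.

```ruby
require 'tmpdir'

# Hypothetical helper: read user_path only if it resolves to a location
# inside base_dir. Raises ArgumentError for anything outside it.
def read_within(base_dir, user_path)
  base = File.realpath(base_dir)
  full = File.expand_path(user_path, base)
  # expand_path collapses ".." segments but does not follow symlinks, so
  # this check alone is not enough if users can create symlinks in base_dir.
  unless full.start_with?(base + File::SEPARATOR)
    raise ArgumentError, "#{user_path.inspect} is outside the allowed directory"
  end
  File.read(full)
end

Dir.mktmpdir do |dir|
  File.write(File.join(dir, 'notes.txt'), "hello\n")
  puts read_within(dir, 'notes.txt')

  begin
    read_within(dir, '../../etc/passwd')
  rescue ArgumentError => e
    puts "Blocked: #{e.message}"
  end
end
```

Because the path is made absolute before it is opened, an input such as |date is treated as an ordinary (and missing) file name inside the base directory.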
We’ve all heard of SQL injection. If you’ve been living under a rock, this is what it looks like. It’s what happens when you directly use user input from the frontend, without sanitization, in an SQL query on the backend.
Let’s say you have a table users that you search by a name entered in a form. Here is how you might do this in Ruby.
User.where("name = '#{name}'")
But this is very unsafe. Consider this malicious input from a nefarious user.
name = " ' OR '1' = '1" # Malicious input
Interpolating this input turns the intended query (first line below) into a faulty one (second line below). The WHERE clause is now always true, so the query returns every user in the table.
SELECT * FROM users WHERE name = '<name>' -- <name> comes from the user
SELECT * FROM users WHERE name = ' ' OR '1' = '1' -- Faulty query
As you can see, this is a pretty severe error and another entry in the OWASP Top Ten.
Solution
The solution is again very simple. Use one of these two ways.
User.where(["name = ?", name])
User.where({ name: name })
Both of these approaches are designed to sanitize the value before generating the query and are therefore impervious to this attack.
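To see why the interpolated version fails and what quoting changes, the query text can be built by hand in plain Ruby. This is an illustration only: the gsub call below is a simplified stand-in for value quoting, assumed here for demonstration, and is not how you should build queries in an application.

```ruby
name = " ' OR '1' = '1" # the malicious input from the example above

# String interpolation pastes the input straight into the SQL text, so the
# quote inside the input closes the string literal early.
unsafe_sql = "SELECT * FROM users WHERE name = '#{name}'"
puts unsafe_sql

# Simplified stand-in for quoting (illustration only): doubling every single
# quote keeps the whole input inside one SQL string literal.
quoted_name = name.gsub("'", "''")
quoted_sql = "SELECT * FROM users WHERE name = '#{quoted_name}'"
puts quoted_sql
```

In the second query the database compares name against one odd-looking string, so the OR clause never becomes part of the SQL. In real code, let ActiveRecord do this through the two forms shown above rather than quoting by hand.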
When you create a Rails model, Rails creates the id field for you. It’s generally an auto-incrementing integer field. For the most part, this is a sufficient implementation. Integer IDs have the advantage of being simple and, since they keep incrementing, there’s no chance of collisions. But what they offer in simplicity, they lose in security: if a record lives at a URL ending in 41, anyone can guess that records 40 and 42 exist and try to load them.
These are not hypothetical scenarios by the way. There are many sites that just do this, and consequently many sites have in the past exposed sensitive personal information to people who tried changing the numbers in the URL.
Solution
If you’re building a web application in 2020, use UUIDs. Rails makes this way too easy. With this single change, the generators will give every newly generated model a UUID primary key.
# config/application.rb
config.generators do |g|
  g.orm :active_record, primary_key_type: :uuid
end
Or if you want to use integers, use large ones that are chosen at random. Let’s take a look at YouTube videos, more specifically let’s look at the URL structure of YouTube videos. This is what a YouTube video URL looks like.
https://www.youtube.com/watch?v=dQw4w9WgXcQ
The string of seemingly random letters at the end is the video ID. It’s basically a random number between $0$ and $64^{11}$ (yes, that’s over 73 quintillion numbers!) encoded in Base64. Not exactly a UUID, but not an auto-incrementing integer field either. Why doesn’t YouTube use simple integer IDs that go 1, 2, 3… and so on? Well, now you know why.
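Both kinds of identifier can be generated with Ruby’s standard library. The sketch below uses SecureRandom; the variable names are illustrative, and the 11-character ID only mimics the shape of the YouTube example rather than reproducing how YouTube actually generates its IDs.

```ruby
require 'securerandom'

# A random (version 4) UUID, the kind used for UUID primary keys.
uuid = SecureRandom.uuid

# 8 random bytes (2**64 possible values) encode to 11 URL-safe Base64
# characters once the padding is stripped.
short_id = SecureRandom.urlsafe_base64(8)

puts uuid
puts short_id

# Number of distinct 11-character Base64 strings, as quoted in the text.
id_space = 64**11
puts id_space
```

Either way, knowing one valid ID tells an attacker nothing useful about the next one, which is exactly the property that auto-incrementing integers lack.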
Yes, Ruby’s power of expression makes it easy to introduce security issues but the solution is not to switch off these features or migrate to a different language. This brings us back to the Rails doctrine quote from earlier. The very next sentence to that quote is this.
We enforce such good senses by convention, by nudges, and through education. Not by banning sharp knives from the kitchen and insisting everyone use spoons to slice tomatoes.
The key takeaway from all of these examples is to never trust your users. User provided input should not be serialized-deserialized, evaluated or rendered directly. The only right way to be safe is to be careful of what you write and thoroughly audit what you have written.
Static code analysis tools such as linters and vulnerability scanners can help you find a lot of issues before they get exploited in the wild. A good linter like RuboCop can find and notify you of security problems like unsafe deserialization and IO hijacking, and can offer suggestions as to possible fixes. Brakeman, a popular vulnerability scanner, can find SQL injection among a laundry list of other possible security lapses. Both of these are free and open source tools that you should absolutely be using in your development and CI workflows.
You should also consider automating this entire audit and review process using code review automation tools like DeepSource, which scans your code on every commit and for every PR through its linters and security analyzers, and can automatically fix a multitude of issues. DeepSource also has its own custom-built analyzers for most languages that are constantly improved and kept up-to-date. And it’s incredibly easy to set up!
version = 1
[[analyzers]]
name = "ruby"
enabled = true
[[transformers]]
name = "rubocop"
enabled = true
Who knew it could be so simple?
Have fun slicing tomatoes, and be careful not to hurt your fingers in the process!