What is a String?
Strings are one of the core data types that we use when we’re coding. Every time you want to represent words or text, that’s a string.
That last paragraph was a string. So is this one!
Generally speaking, we’re going to represent a string something like this:
"This is a string. Yay!"
Depending on your programming language, a string can be either a primitive type or an Object, and either mutable or immutable. But all strings are composed of a sequence of characters.
These characters can be ASCII or Unicode, depending on how we choose to do our encoding.
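Essentially, all that boils down to is how many characters we have available to us. Standard ASCII defines only 128 characters (extended 8-bit variants have 256), so each character can be represented by a single byte. Unicode covers far more characters, so a single character may need more than one byte, depending on the encoding.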
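To make the byte-count point concrete, here is a small Python sketch. The helper name encoded_size is made up for this illustration, and the encodings chosen (ASCII, UTF-8, UTF-16) are just common examples rather than the only options.

```python
def encoded_size(text: str, encoding: str) -> int:
    """Return how many bytes `text` occupies when stored in `encoding`."""
    return len(text.encode(encoding))


# ASCII-only text: one byte per character in both ASCII and UTF-8.
print(encoded_size("Yay!", "ascii"))
print(encoded_size("Yay!", "utf-8"))

# A character outside ASCII (U+00E9, e with an acute accent) needs
# more than one byte in UTF-8.
print(encoded_size("\u00e9", "utf-8"))

# UTF-16 spends two bytes on each of these characters.
print(encoded_size("Yay!", "utf-16-le"))
```

The takeaway is that the same text can take up a different number of bytes depending on the encoding we pick.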
Java, C, C++, and Python Strings
In this section, we’re going to break down strings by programming language. If there is a specific language that you’re planning on using for your interview, I recommend that you focus there.
Each language differs quite a bit, so let’s get into it!
Java Strings
Fast Facts:
- Mutable? No
- Primitive? No
- Comparison: s1.equals(s2)
- Access the ith character: s1.charAt(i)
When using strings in Java, we need to be aware of several key points that make Java strings different from strings in other languages.
For starters, strings in Java are actually an Object type and not a primitive. This means that, while we often think of them as a primitive type, strings actually have the properties of Objects more than they do of primitives.
For example, when comparing strings in Java, it is best to use the equals() method rather than ==. This is important because == does not actually compare the value of a string, like it would with a primitive type. Rather, it compares the Object pointers to see if two strings are actually the same object.
equals() is preferred because it will make sure that we are actually comparing the values of two strings, rather than the pointers.
The other thing that we need to be very aware of in Java is that strings are immutable. That means that every time we “modify” a string, we are actually creating a copy.
So what does that mean for us? Well for one, doing something like String s2 = s1 + "z" is not a constant-time operation. Rather it takes time proportional to the length of s1 because we have to copy the entirety of s1 into a new object.
To avoid time complexity issues like this, Java provides us with a StringBuilder class that we will want to use if we are constantly modifying or appending on to a string. A StringBuilder is really a wrapper for an array of characters that provides us with an easy toString() method that we can use to recover a string.
Useful Java String Methods:
- length() – Returns the length of the string
- charAt(int i) – Returns the character at index i
- substring(int i, int j) – Returns the substring from i to j (inclusive of index i and exclusive of index j)
- contains(String s) – Returns true if s is contained in the string
- indexOf(String s) – Returns the starting index of the first occurrence of s, or -1 if s does not occur
- toCharArray() – Converts a string to a character array (useful if we want to repeatedly modify the string)
C Strings
Fast Facts:
- Mutable? Yes
- Primitive? No
- Comparison: strcmp(s1, s2)
- Access the ith character: s1[i]
C differs from Java in that C strings are nothing more than simple character arrays that are terminated with a null character. This affords us some advantages as well as some disadvantages.
In the pro category, strings in C are mutable. Since it’s just an array it’s easy for us to access any of the characters, modify them, and whatnot. This means that certain types of problems that we might want to do become super easy.
On the flip side, though, it makes a lot of things more difficult. Because strings are not a class of their own, we can’t easily do any operations or comparisons on the strings directly, so we end up having to rely on library functions for basically everything.
Since they are arrays, we also have to allocate the entire size of the string up front (including room for the null terminator) or risk having to copy all of the data. We also have to make sure that the string is null-terminated, or library functions such as strlen will read past the end of the array.
Useful C String Functions:
- strstr(char *s1, char *s2) – Returns a pointer to the first occurrence of s2 in s1, or NULL if s2 is not found
- strcat(char *s1, char *s2) – Appends s2 to the end of s1 (s1 must have enough space for the result)
- strcpy(char *s1, char *s2) – Copies the contents of s2 into s1
- strlen(char *s1) – Returns the length of a string, not counting the null terminator
C++ Strings
Fast Facts:
- Mutable? Yes
- Primitive? No
- Comparison: s1.compare(s2)
- Access the ith character: s1[i]
Unlike C, C++ provides std::string, which is part of the C++ standard library. As long as you #include <string> in your program, you have access to it and can treat strings in much the same way as they are treated in Java.
Similar to Java, strings in C++ are treated as objects as opposed to primitives. We see this when we compare two strings by calling s1.compare(s2), which returns 0 when the strings are equal and a negative or positive value otherwise. However, the standard == operator is also overloaded in C++ for std::string and returns a bool. Convention dictates which of the two is used, but both agree on whether two strings are equal.
Unlike Java strings, however, C++ strings are mutable. We can change a character in place with s1[i] = 'z' or append with s1 += "z" without creating a new object. What does carry over from Java is the cost of copying: assigning a std::string or passing it by value copies its contents, which takes time proportional to its length, so pass by reference when a copy isn't needed.
Useful C++ String Methods:
- s1.length() – Returns the length of the string (from string::length)
- s1.find(s2) – Returns the index of the first occurrence of s2 in s1, or std::string::npos if s2 is not found (from string::find)
- strcpy(char_array, s1.c_str()) – Converts s1 into a character array
- s1.substr(i,j) – Get the substring of s1 from i with length j
Python Strings
Fast Facts:
- Mutable? No
- Primitive? No (str is a built-in type, but every value in Python is an object)
- Comparison: s1 == s2
- Access the ith character: s1[i]
In typical Python fashion, this is probably the easiest of these four languages for handling strings. Rather than having to use a lot of function calls, much of what we might want to do is built into the language, such as getting substrings of a string.
However, you should use caution when using Python for one very important reason: The simplicity of the language often masks underlying complexity.
For example, we can very easily get a substring of a string in Python using the Python slicing syntax: s[1:5]. Writing it this way makes it very easy to assume that it is a constant-time operation. However, in most cases, operations like this are still going to copy the substring (remember strings are immutable), meaning that it is going to take us linear time.
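Here is a rough sketch of how that hidden cost can sneak in, assuming the usual CPython behavior where a slice copies its characters. Both function names are invented for this example. The first version re-slices the string on every step, so the copies add up to quadratic work; the second only moves two indices and stays linear.

```python
def is_palindrome_slicing(s: str) -> bool:
    # Each s[1:-1] builds a new string by copying the remaining
    # characters, so the total work grows quadratically with len(s).
    while len(s) > 1:
        if s[0] != s[-1]:
            return False
        s = s[1:-1]
    return True


def is_palindrome_indexed(s: str) -> bool:
    # Two indices walk toward the middle. Nothing is copied, so the
    # work grows linearly with len(s).
    left, right = 0, len(s) - 1
    while left < right:
        if s[left] != s[right]:
            return False
        left += 1
        right -= 1
    return True


print(is_palindrome_slicing("racecar"), is_palindrome_indexed("racecar"))
print(is_palindrome_slicing("abca"), is_palindrome_indexed("abca"))
```

Both versions give the same answers. The difference only shows up in running time on long inputs, which is exactly the kind of thing an interviewer may ask about.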
Useful Python String Methods:
- len(s1) – Returns the length of the string
- s1 in s2 – Is s1 a substring of s2
- s2.index(s1) – Returns the index of the first occurrence of s1 in s2 (raises ValueError if s1 is not found)
- list(s1) – Converts s1 into a list of characters
- s1[i:j] – Get the substring of s1 from i to j (inclusive of index i and exclusive of index j)
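As a quick illustration, the operations above might be combined like this. The variable names and sample text are only placeholders for the example.

```python
text = "This is a string. Yay!"
word = "string"

length = len(text)                     # number of characters in text
found = word in text                   # True when word occurs somewhere in text
start = text.index(word)               # first occurrence; ValueError if absent
piece = text[start:start + len(word)]  # slice: start inclusive, end exclusive

# Strings are immutable, so to "modify" one we edit a list of
# characters and join it back into a new string.
chars = list(word)
chars[0] = "S"
edited = "".join(chars)

print(length, found, start, piece, edited)
```

Note that word itself is unchanged after the edit; "".join(chars) produces a brand new string.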