String Tutorial

> String Tutorial, Descriptions, Explanations and Examples

William_Wilson
post 15 Sep, 2007 - 05:28 PM
Post #1


Java – Strings


-----------------------------------------------------------------------------------------------------------
Table of Contents:
*Introduction
*Constructors
-Code Examples of Constructors
*String Information
*Methods
-Code Examples of Methods
*toString()
*More Complex (Useful) Examples
-----------------------------------------------------------------------------------------------------------



Introduction:
Every Java programmer who has written any code has come into contact with the String type:
public static void main(String args[])
or
public static void main(String[] args)

Both declarations state that this method accepts an array of Strings as input.

If you’ve programmed in any language, you have come across strings in some form.

CODE

    public void chars()
    {
        char text[] = {'T','h','i','s',' ','i','s',' ','a','n',' ','a','r','r','a','y',' ','o','f',' ','c','h','a','r'};
        System.out.println(text);
    }
    
    public void string()
    {
        String text = "This is a String";
        System.out.println(text);
    }

These two methods print their text in the same way, because println accepts an array of char as well as a String. Knowing this is important when using some of the String methods that are available.

Constructors:
*Examples of most constructors will be included after the set of descriptions.
**All deprecated constructors and methods have been left out, as of the Java 6 documentation at www.java.sun.com.

String()
-Creates an empty String, which contains no characters and has a length of 0. Java Strings are not terminated by a null character ‘\0’.

String(byte[] bytes)
-Creates a String from the array of bytes supplied. Any size of byte array can be used. The default character set will be used.

String(byte[] bytes, String charsetName)
-Creates a String from the array of bytes supplied. Any size of byte array can be used. The supplied character set is used, based on its name, rather than object.

String(byte[] bytes, Charset charset)
-Creates a String from the array of bytes supplied. Any size of byte array can be used. The supplied character set will be used.

String(byte[] bytes, int offset, int length)
-Creates a String from length bytes of the byte array, starting at index offset. The default character set is used.

String(byte[] bytes, int offset, int length, Charset charset)
-Creates a String from length bytes of the byte array, starting at index offset. The supplied character set is used.

String(byte[] bytes, int offset, int length, String charsetName)
-Creates a String from length bytes of the byte array, starting at index offset. The supplied character set is used, based on its name, rather than object.

String(char[] value)
-Creates a String from the supplied character array.

String(char[] value, int offset, int count)
-Creates a String from count characters of the char array, starting at index offset.

String(int[] codePoints, int offset, int count)
-Creates a String from count Unicode code points of the codePoints array, starting at index offset.

String(String original)
-Creates a new String object which contains the same characters as the original String.

String(StringBuffer buffer)
-Creates a new String which contains the same characters which were contained in the supplied buffer.

String(StringBuilder builder)
-Creates a new String which contains the same characters which were contained in the supplied builder.
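The code point and StringBuilder constructors are the two in this list that the examples do not cover, so here is a minimal sketch of both. The class and method names (ExtraConstructors, fromCodePoints, fromBuilder) are made up for illustration.

```java
class ExtraConstructors
{
    // Builds a String from part of an array of Unicode code points.
    static String fromCodePoints()
    {
        int[] codePoints = {0x4A, 0x61, 0x76, 0x61, 0x21}; // J, a, v, a, !
        return new String(codePoints, 0, 4);               // first 4 code points only
    }

    // Copies the current contents of a StringBuilder into a new String.
    static String fromBuilder()
    {
        StringBuilder builder = new StringBuilder("Text");
        String text = new String(builder);
        builder.append(" changed"); // later changes to the builder do not reach text
        return text;
    }

    public static void main(String[] args)
    {
        System.out.println(fromCodePoints()); // Java
        System.out.println(fromBuilder());    // Text
    }
}
```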


Examples:
CODE

    public void empty_string()
    {
        String string = new String();
        System.out.println(string);
        string = "Some Text";
        System.out.println(string);
    }

The String is created using the empty constructor, so the first println prints an empty line. The variable is then reassigned to refer to a new String, "Some Text"; the empty String itself is never changed.

CODE

    public void byte1()
    {
        byte[] bytes = {'A',' ','B',' ','C'};
        System.out.println(bytes[0]);
        String text = new String(bytes);
        System.out.println(text);
    }

You might expect the output to be:
A
A B C
But you would be mistaken. A byte holds a number, not a character, so printing bytes[0] shows its numeric value. The actual output is:
65
A B C
The constructor decodes the value 65 (the ASCII code for ‘A’) into a character of the String.

CODE

    public void byte2()
    {
        byte[] bytes = {'A',' ','B',' ','C'};
        System.out.println(bytes[0]);
        try
        {
            String text = new String(bytes,"ISO-8859-1");
            System.out.println(text);
            text = new String(bytes,"UTF-16BE"); //incompatible charset
            System.out.println(text);
            text = new String(bytes,"UTF-16LE"); //incompatible charset
            System.out.println(text);
        }
        catch(Exception e) { e.printStackTrace(); } //report the problem rather than hide it
    }

I have skipped over the other option, which takes a Charset object, on purpose. At the time of writing (Java 6), supplying the character set by name has been reported to be faster than supplying a Charset object. The bug has been logged, but is yet to be resolved.
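For comparison, the Charset-object form that was skipped looks roughly like this. It is only a sketch: the class name CharsetObjectExample and the method decode are made up, and Charset.forName is used to obtain the object. One practical difference is that this constructor declares no checked exception, so no try/catch is needed.

```java
import java.nio.charset.Charset;

class CharsetObjectExample
{
    // Decodes the bytes with a Charset object instead of a charset name.
    static String decode(byte[] bytes)
    {
        Charset latin1 = Charset.forName("ISO-8859-1");
        return new String(bytes, latin1);
    }

    public static void main(String[] args)
    {
        byte[] bytes = {'A', ' ', 'B', ' ', 'C'};
        System.out.println(decode(bytes)); // A B C
    }
}
```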

CODE

    public void byte3()
    {
        byte[] bytes = {'G',' ','H',' ','I'};
        System.out.println(bytes[0]);
        try
        {
            String text = new String(bytes,0,1,"ISO-8859-1");
            System.out.println(text);
            text = new String(bytes,0,1,"UTF-16BE"); //incompatible charset
            System.out.println(text);
            text = new String(bytes,0,1,"UTF-16LE"); //incompatible charset
            System.out.println(text);
        }
        catch(Exception e) { e.printStackTrace(); } //report the problem rather than hide it
    }

About as useful in everyday coding as the byte2 method, but this comes into play more with international language support. It is simply an example of offsets and lengths. Be careful that offset plus length does not go past the end of the array, or an IndexOutOfBoundsException is thrown.
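To see what happens when the offset and length run past the end of the array, here is a small sketch. The class name OffsetBounds and the helper isOutOfRange are made up for the illustration.

```java
class OffsetBounds
{
    // Returns true if new String(bytes, offset, length) rejects the range.
    static boolean isOutOfRange(byte[] bytes, int offset, int length)
    {
        try
        {
            new String(bytes, offset, length);
            return false;
        }
        catch (IndexOutOfBoundsException e)
        {
            return true;
        }
    }

    public static void main(String[] args)
    {
        byte[] bytes = {'G', ' ', 'H', ' ', 'I'};
        System.out.println(isOutOfRange(bytes, 0, 5)); // false, the whole array
        System.out.println(isOutOfRange(bytes, 3, 3)); // true, 3 + 3 is past the end
    }
}
```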

CODE

    public void chars2()
    {
        char text[] = {'T','h','i','s',' ','i','s',' ','a','n',' ','a','r','r','a','y',' ','o','f',' ','c','h','a','r'};
        String string = new String(text);
        System.out.println(string);
    }

Similar to the code at the very beginning, except we create the String object before passing it to the System.out.println command.

CODE

    public void chars3()
    {
        char text[] = {'T','h','i','s',' ','i','s',' ','a','n',' ','a','r','r','a','y',' ','o','f',' ','c','h','a','r'};
        String string = new String(text,0,4);
        System.out.println(string);
    }

Another example of using offsets and lengths to get a substring of the original text.

CODE

    public void string2()
    {
        String text = "This is a String";
        System.out.println(text);
        String text2 = new String(text);
        text = "";
        System.out.println(text2);
    }

Reassigning the original variable does not affect text2, so we can see that the 2 Strings are separate objects. Whether they share any internal storage is up to the implementation, but since a String cannot be changed once it is created, they might as well be completely unique.
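The difference between "the same object" and "the same characters" can be checked directly with == and equals. A minimal sketch, with the class name CopyComparison and its methods made up for the purpose:

```java
class CopyComparison
{
    // True only if both variables refer to the very same object.
    static boolean sameObject(String a, String b)
    {
        return a == b;
    }

    // True if both Strings contain the same characters.
    static boolean sameCharacters(String a, String b)
    {
        return a.equals(b);
    }

    public static void main(String[] args)
    {
        String text = "This is a String";
        String text2 = new String(text);
        System.out.println(sameObject(text, text2));     // false
        System.out.println(sameCharacters(text, text2)); // true
    }
}
```

This is why Strings should be compared with equals() rather than ==.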

CODE

    public void stringBuffer()
    {
        StringBuffer buffer = new StringBuffer("Text");
        String text = new String(buffer);
        System.out.println(text);
        System.out.println(buffer);
    }

I have chosen to print the buffer here as well, as it is important to know that copying the data from a StringBuffer to a String does not clear or flush the buffer.


String Information:
It is important to remember that in Java, a String is an object, unlike the basic types such as int and char. This is important when expanding the use of Strings, as they can be passed and created in true Object Oriented style, as well as directly stored in Vectors. Some complex examples will show how Strings are useful even when the end result is not necessarily a String.
Another point to keep in mind is that Strings are, for the most part, not efficient to modify. A String cannot be changed once it is created, so every time the word new is used, or a method appears to change a String, another String is created. The old String is discarded, not updated in place. This is one advantage that C/C++ has, where character data can be manipulated directly through pointers.
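When a String has to be built up piece by piece, a StringBuilder avoids creating a new String on every step. The sketch below compares the two approaches; the class BuildingStrings and its method names are made up for illustration.

```java
class BuildingStrings
{
    // Creates a new String object on every pass through the loop.
    static String withConcat(int n)
    {
        String s = "";
        for (int i = 0; i < n; ++i)
        {
            s += i;
        }
        return s;
    }

    // Appends into one growing buffer and creates the String once at the end.
    static String withBuilder(int n)
    {
        StringBuilder builder = new StringBuilder();
        for (int i = 0; i < n; ++i)
        {
            builder.append(i);
        }
        return builder.toString();
    }

    public static void main(String[] args)
    {
        System.out.println(withConcat(5));  // 01234
        System.out.println(withBuilder(5)); // 01234
    }
}
```

Both give the same result; the difference only starts to matter when the loop runs many times.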

Something to consider:
CODE

    public void delimiter()
    {
        String text = "A whole bunch of Text";
        System.out.println(text);
        text = "A whole \0bunch of Text";
        System.out.println(text);
    }

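The ‘\0’ in the middle of the sentence is an ordinary character as far as Java is concerned; it does not end the String. Some output windows stop displaying text at this character, so the printed output may look cut short even though the String is intact.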
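Rather than trusting what the console displays, the String itself can be inspected. A small sketch (the class name NulCharacter and the method withNul are made up):

```java
class NulCharacter
{
    static String withNul()
    {
        return "A whole \0bunch of Text";
    }

    public static void main(String[] args)
    {
        String text = withNul();
        System.out.println(text.length());      // 22, the '\0' is counted
        System.out.println(text.indexOf('\0')); // 8
        System.out.println(text.substring(9));  // bunch of Text
    }
}
```

The characters after the ‘\0’ are still part of the String and can be reached like any others.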


Methods:
I know this is what many new Java coders were waiting to see: how to manipulate and process Strings.
I have omitted a few of the more obscure methods. A complete list is available from the Java documentation on the Sun website.

char charAt(int index)
-Returns a char of the character at the given index.

int codePointAt(int index)
-Returns the code point value of the character at the given index.

int codePointBefore(int index)
- Returns the code point value of the character before the given index.

int codePointCount(int beginIndex, int endIndex)
-Returns the number of code points in the range between beginIndex and endIndex.

int compareTo(String string)
-Compares the original and string lexicographically. 0 is returned if the Strings match, a negative value if the original comes first, and a positive value if it comes after.

int compareToIgnoreCase(String string)
-Compares the original and string lexicographically, ignoring the case of letters. 0 is returned if the Strings match.

String concat(String str)
-Returns a new String made of the calling String (this) followed by str. The calling String itself is not changed.

boolean contains(CharSequence s)
-Returns true if the String contains the specified sequence.

static String copyValueOf(char[] data)
-Returns a String representing the supplied character array.

static String copyValueOf(char[] data, int offset, int length)
-Returns a String representing the character array at the supplied offset and length.

boolean endsWith(String suffix)
-Returns true if the end of the ‘this’ String matches the suffix supplied.

boolean equals(Object anObject)
-Returns true if the Object supplied is a String containing the same characters as ‘this’ String.

boolean equalsIgnoreCase(String string)
-Returns true if ‘this’ String is the same as the String supplied, ignoring the case of letters.

byte[] getBytes()
-Returns a byte array of ‘this’ String, encoded with the default character set.

void getChars(int start, int end, char[] dest, int destStart)
-Copies the characters in the range start to end, from ‘this’ String into the character array dest, starting at position destStart in the destination array.

int hashCode()
-Returns a hash code for ‘this’ String.

int indexOf(int ch)
-Returns the index value of the first occurrence of the character ch.

int indexOf(int ch, int offset)
-Returns the index value of the first occurrence of the character ch at or after the supplied offset.

int indexOf(String str)
-Returns the index value of the first occurrence of the supplied substring.

int indexOf(String str, int offset)
-Returns the index value of the first occurrence of the supplied substring at or after the supplied offset.

boolean isEmpty()
-Returns true if ‘this’ String’s length is 0.

int lastIndexOf(int ch)
-Returns the index value of the last occurrence of the character ch.

int lastIndexOf(String str)
-Returns the index value of the last occurrence of the substring str.

int length()
-Returns the length of the String. A count of all characters, including spaces.

boolean matches(String regex)
-Returns true if ‘this’ String matches the requirements of the regular expression regex.

String replace(char oldChar, char newChar)
-Returns a String with all occurrences of oldChar replaced by newChar.

String[] split(String regex)
-Splits the string into smaller Strings around the specified regular expression regex.

String substring(int startIndex, int endIndex)
-Returns a String that is a substring of the original, from startIndex (included) to endIndex (not included).

char[] toCharArray()
-Returns a char array of ‘this’ String.

String toLowerCase()
-Returns a lowercase version of ‘this’ string replacing all uppercase letters.

String toString()
-Returns a String of ‘this’ String.
*Not so important when used on Strings, but on other types and Objects it becomes very useful.

String toUpperCase()
-Returns an uppercase version of ‘this’ String replacing all lowercase letters.

String trim()
-Returns ‘this’ String with all leading and trailing white space omitted.

static String valueOf(boolean b)
-Returns the String representation of the boolean argument b.

static String valueOf(char c)
-Returns the String representation of the character argument c.

static String valueOf(char[] c)
-Returns the String representation of the character array argument c.

static String valueOf(double d)
-Returns the String representation of the double argument d.

static String valueOf(float f)
-Returns the String representation of the float argument f.

static String valueOf(int i)
-Returns the String representation of the integer argument i.

static String valueOf(Object o)
-Returns the String representation of the Object argument o.
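A few of the methods described above do not appear in the examples that follow, so here is a quick sketch of them on the same sample text. The class name MoreMethods is made up for illustration.

```java
class MoreMethods
{
    public static void main(String[] args)
    {
        String string = "William_Wilson";

        System.out.println(string.contains("_Wil"));                   // true
        System.out.println(string.equalsIgnoreCase("WILLIAM_WILSON")); // true
        System.out.println(string.lastIndexOf('W'));                   // 8
        System.out.println(string.lastIndexOf("il"));                  // 9
        System.out.println(string.replace('l', 'L'));                  // WiLLiam_WiLson
        System.out.println("".isEmpty());                              // true
        System.out.println(String.valueOf(3.5));                       // 3.5
    }
}
```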


Examples:

CODE

        String string = "William_Wilson";
        char c = string.charAt(0);
        System.out.println(c);

The character c now holds the character in string at index 0, which is ‘W’.

CODE

        String string  = "William_Wilson";
        String string2 = "William";
        String string3 = "_";
        String string4 = "Wilson";
                
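        int i = string.compareTo(string2);
        System.out.println(i); //7 (string2 is a prefix of string, so the lengths differ by 7)
        i = string.compareTo(string);
        System.out.println(i); //0 (same String)
        i = string3.compareTo(string4);
        System.out.println(i); //8 ('_' is 95, 'W' is 87)
        i = string4.compareTo(string3);
        System.out.println(i); //-8 ('W' is 87, '_' is 95)

The result is not a count of errors. The Strings are compared one character at a time, and the difference between the first pair of characters that do not match is returned. We get -8 in the last example because ‘W’ (87) comes before ‘_’ (95). Only when one String is a prefix of the other, as in the first example, is the difference in length returned instead.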
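The comparison rule documented for compareTo can be sketched by hand, which makes the returned numbers easier to predict. This is an illustration only, not the library source; the class CompareSketch and the method compare are made up.

```java
class CompareSketch
{
    // Follows the documented behaviour of String.compareTo.
    static int compare(String a, String b)
    {
        int limit = Math.min(a.length(), b.length());
        for (int i = 0; i < limit; ++i)
        {
            char ca = a.charAt(i);
            char cb = b.charAt(i);
            if (ca != cb)
            {
                return ca - cb; // the first differing character decides
            }
        }
        return a.length() - b.length(); // one String is a prefix of the other
    }

    public static void main(String[] args)
    {
        System.out.println(compare("William_Wilson", "William")); // 7
        System.out.println(compare("_", "Wilson"));               // 8
        System.out.println(compare("Wilson", "_"));               // -8
    }
}
```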

CODE

        String string2 = "William";
        String string3 = "_";
        String string4 = "Wilson";

        String s = string2;    //William
        System.out.println(s);    //William
        s = s.concat(string3);    //William + _
        System.out.println(s);    //William_
        s = s.concat(string4);    //William_ + Wilson
        System.out.println(s);    //William_Wilson

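concat does not change the String ‘this’ (which is s in all cases). It returns a new String with the String passed to concat appended to the end, which is why the result is assigned back to s each time.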

CODE

        String string  = "William_Wilson";
        String string2 = "William";
        String string3 = "_";
        String string4 = "Wilson";

        boolean b = string.endsWith(string4); //true
        System.out.println(b);
        b = string.endsWith("son"); //true
        System.out.println(b);
        b = string.endsWith("SON"); //false
        System.out.println(b);
        b = string.endsWith(string3); //false
        System.out.println(b);

Notice that case matters.

CODE

        String string2 = "William";
        byte[] bytes = string2.getBytes();
        for(int j=0;j<bytes.length;++j)
        {
            System.out.print(bytes[j] + " ");
        }
        System.out.println();

Displays the numeric code of each letter. For these letters the values are the same as their ASCII codes.

CODE

        String string2 = "William";
        char[] chars   = {'W','i','l','l','i','a','m'};

        char[] ch = new char[7];
        string2.getChars(0,string2.length(),ch,0);
        System.out.println(ch);
        System.out.println(chars);

The output of a known char array is added for comparison, thus being able to see that the result is exactly what is expected.

CODE

        String string  = "William_Wilson";

        int hash = string.hashCode();
        System.out.println(hash);

Straightforward: the algorithm attempts to make a ‘unique’ hash value for the String. I use unique loosely, as it is possible to have collisions (two different Strings with the same hash value), but it works well in practice.
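The String documentation describes the hash as s[0]*31^(n-1) + s[1]*31^(n-2) + ... + s[n-1], calculated with int arithmetic. The sketch below recomputes it by hand and shows a well known collision; the class HashSketch and the method hash are made up for illustration.

```java
class HashSketch
{
    // Recomputes the documented String hash formula with int arithmetic.
    static int hash(String s)
    {
        int h = 0;
        for (int i = 0; i < s.length(); ++i)
        {
            h = 31 * h + s.charAt(i);
        }
        return h;
    }

    public static void main(String[] args)
    {
        String string = "William_Wilson";
        System.out.println(hash(string) == string.hashCode()); // true

        // Two different Strings with the same hash value.
        System.out.println("Aa".hashCode()); // 2112
        System.out.println("BB".hashCode()); // 2112
    }
}
```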

CODE

        String string  = "William_Wilson";

        int w = string.indexOf('W'); //0
        System.out.println(w);
        w = string.indexOf('m'); //6
        System.out.println(w);

Simple really: indexOf returns the index of the first occurrence. Remember that indexes start at 0.

CODE

        String string  = "William_Wilson";

        int w = string.indexOf('m',0); //6
        System.out.println(w);
        w = string.indexOf('m',7); //-1
        System.out.println(w);

Very similar, illustrates the negative return value if there are no more occurrences after the supplied index.

CODE

        String string  = "William_Wilson";
        String string2 = "William";
        String string3 = "_";

        System.out.println(string.length()); //14
        System.out.println(string2.length()); //7
        System.out.println(string3.length()); //1

Gets the length of the string.

CODE

        String string  = "William_Wilson";

        String[] set = string.split("_");
        for(int k=0;k<set.length;++k)
        {
            System.out.println(set[k]);
        }

Note that the character ‘_’ is not kept, as it is the delimiter that defines the split.
I have intentionally avoided talking about Regular Expressions and used a simple case here, as Strings, and not Regex, are the topic of this tutorial.
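One regex detail is worth knowing even in the simple case: the argument to split is always treated as a regular expression, so a delimiter such as "." does not behave like ‘_’ did. A minimal sketch, in which the class SplitCaveat and the helper splitOnLiteral are made up; Pattern.quote is the standard way to treat text literally.

```java
import java.util.regex.Pattern;

class SplitCaveat
{
    // Splits around the delimiter taken literally, not as a regex.
    static String[] splitOnLiteral(String text, String delimiter)
    {
        return text.split(Pattern.quote(delimiter));
    }

    public static void main(String[] args)
    {
        // "." as a regex matches every character, so nothing is left over.
        System.out.println("a.b.c".split(".").length);             // 0
        System.out.println(splitOnLiteral("a.b.c", ".").length);   // 3
    }
}
```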

CODE

        String string  = "William_Wilson";
        String string2 = "William";
        String string3 = "_";

        boolean b2 = string.startsWith("Wil"); //true
        System.out.println(b2);
        b2 = string2.startsWith("Wil"); //true
        System.out.println(b2);
        b2 = string3.startsWith("Wil"); //false
        System.out.println(b2);

The mirror image of endsWith(): easy to implement and use.

CODE

        String string  = "William_Wilson";

        String sub = string.substring(0,6); //Willia
        System.out.println(sub);
        sub = string.substring(3,9); //liam_W
        System.out.println(sub);
        sub = string.substring(2,3); //l
        System.out.println(sub);
        sub = string.substring(9,13); //ilso
        System.out.println(sub);

Be careful with this method: the first index is included, while the second index is not.

CODE

        String string  = "William_Wilson";

        char[] cha = string.toCharArray();
        System.out.println(cha);

A simpler alternative to getChars().

CODE

        String string  = "William_Wilson";

        System.out.println(string.toLowerCase());
        System.out.println(string.toUpperCase());

Note that symbols and digits (such as _ or 1) are not affected.

CODE

        String spaces = "      lots of leading and trailing spaces        ";
        String trim   = spaces.trim();
        System.out.println(spaces);
        System.out.println(trim);

Leading and Trailing spaces are removed.

toString()
This method is a special case. It is inherited from the class Object, so all objects can use it.
An Object has a String representation, either supplied by its class or a default made of the class name, ‘@’ and the object’s hash code in hexadecimal. We can supply a toString method for each of our classes to control how the object is displayed.

Example:
CODE

public class Node
{
    int x;
    int y;
    
    public Node()
    {
        x = y = 0;
    }
    
    public Node(int anX, int aY)
    {
        x = anX;
        y = aY;
    }
    
    public static void main(String args[])
    {
        Node n1 = new Node();
        Node n2 = new Node(14, 7);
        
        System.out.println(n1);
        System.out.println(n2);
    }    
}

Your output will be something like this:
Node@16930e2
Node@108786b

Not very useful… but this is where the toString method comes in. We can print the Object, in this case a Node, directly if we simply supply a way to display the Object:
CODE

    public String toString()
    {
        return "(" + x + "," + y + ")";
    }

By adding this small method, our output becomes:
(0,0)
(14,7)

Much better!

Since the toString method always uses the ‘this’ Object and accepts no parameters, each object prints out its own information. The toString method appears to be a special case in some ways, but it really isn’t. You can do as many calculations as you like in the method, as long as the return type is a String.
You can also call the toString method explicitly if you wish:
System.out.println(n1.toString());
This will give the same result as before. Java automatically calls the toString() method when an object is used as a String. If the class does not supply one, the default inherited from Object is used, but this is rarely good enough.
*This is similar to constructors: a default constructor exists if you do not declare any constructor yourself.
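The automatic call also happens in String concatenation and in String.valueOf, not only in println. A short sketch, assuming a cut-down Node nested inside a made-up class ToStringDemo so the example stands alone:

```java
class ToStringDemo
{
    static class Node
    {
        final int x;
        final int y;

        Node(int anX, int aY)
        {
            x = anX;
            y = aY;
        }

        @Override
        public String toString()
        {
            return "(" + x + "," + y + ")";
        }
    }

    public static void main(String[] args)
    {
        Node n = new Node(14, 7);
        System.out.println("Node: " + n);      // Node: (14,7)
        System.out.println(String.valueOf(n)); // (14,7)
    }
}
```

The @Override annotation is optional, but it lets the compiler catch a misspelled method name such as tostring().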

Another example:
CODE

public class arrayTOstring
{
    int[] array;
    
    public arrayTOstring()
    {
        array = new int[] {0,0,0,0};
    }
    
    public arrayTOstring(int[] anArray)
    {
        array = anArray;
    }
    
    public String toString()
    {
        String out = "(";
        for(int i=0;i<array.length-1;++i)
        {
            out += String.valueOf(array[i]) + ",";
        }
        out += String.valueOf(array[array.length-1]) + ")";
        return out;
    }
    
    public static void main(String args[])
    {
        arrayTOstring a1 = new arrayTOstring();
        arrayTOstring a2 = new arrayTOstring(new int[]{1,2,3,4,5,6,7});
        
        System.out.println(a1);
        System.out.println(a2);
    }    
}

Notice how the toString method uses a for loop to build the String representation of the array. The conversion can be done many ways, but this also illustrates the use of the valueOf method. Any primitive can have a String representation in this way. It is true we can place primitives directly into a String:
out += array[i] + ",";
But it is important to understand the options/alternatives and their uses.
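Two further alternatives, offered as a sketch rather than a replacement: a StringBuilder loop that also copes with an empty array, and the library method Arrays.toString, which uses its own bracket style. The class ArrayFormatting and the method format are made up for illustration.

```java
import java.util.Arrays;

class ArrayFormatting
{
    // Builds "(a,b,c)" and returns "()" for an empty array.
    static String format(int[] array)
    {
        StringBuilder out = new StringBuilder("(");
        for (int i = 0; i < array.length; ++i)
        {
            if (i > 0)
            {
                out.append(",");
            }
            out.append(array[i]);
        }
        return out.append(")").toString();
    }

    public static void main(String[] args)
    {
        System.out.println(format(new int[] {1, 2, 3}));          // (1,2,3)
        System.out.println(format(new int[0]));                   // ()
        System.out.println(Arrays.toString(new int[] {1, 2, 3})); // [1, 2, 3]
    }
}
```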


Complex Code Examples:
*I make no claim that these are the most efficient choices. These examples are meant to show the largest number of String operations possible, to help you learn how to improve upon the code presented here.

I will not analyze the code in extreme detail, as these are simple concepts from above, joined to perform useful tasks. I will describe the process and result.

CODE

    public String reverse(String s)
    {
        char[] chars = s.toCharArray();
        for(int i=0;i< s.length()/2;++i)
        {
            char c = chars[i];
            chars[i] =  chars[chars.length-i-1];
            chars[chars.length-i-1] =  c;
        }
        return String.valueOf(chars);
    }

This method systematically swaps characters from the start and end of a character array, working towards the middle, to reverse a String. By accessing the String as an array of chars, the String becomes much easier to handle, and is easily traversed with a for loop.
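For everyday use the library already offers a shortcut: StringBuilder has a reverse() method. A minimal sketch for comparison, with the class name ReverseAlternative made up:

```java
class ReverseAlternative
{
    // Reverses a String using StringBuilder's built-in reverse().
    static String reverse(String s)
    {
        return new StringBuilder(s).reverse().toString();
    }

    public static void main(String[] args)
    {
        System.out.println(reverse("William")); // mailliW
    }
}
```

The hand-written swap loop is still the better exercise for learning how char arrays and indexes work.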

CODE

    public String invertCase(String s)
    {
        char[] chars = s.toCharArray();
        for(int i=0;i<s.length();++i)
        {
            if((String.valueOf(chars[i])).toLowerCase().equals(String.valueOf(chars[i])))
            {
                chars[i] = (((String.valueOf(chars[i])).toUpperCase()).toCharArray())[0];
            }
            else
            {
                chars[i] = (((String.valueOf(chars[i])).toLowerCase()).toCharArray())[0];
            }
            }
        }
        return String.valueOf(chars);
    }

The above method inverts the case of letters in a String, looping through the String as a char array.
Eg: a becomes A, and A becomes a
There are many, many String methods here. I will break down one of the replacements:
(((String.valueOf(chars[i])).toUpperCase()).toCharArray())[0];
The best way to look at it is to start at the innermost set of brackets:
(chars[i]) – This is obviously a single char of the array.
Now expand one level:
(String.valueOf(chars[i])) – This becomes the String representation of the char, required by the method toUpperCase()
Expand another level:
((String.valueOf(chars[i])).toUpperCase()) – now we have an uppercase String
Last set of brackets:
(((String.valueOf(chars[i])).toUpperCase()).toCharArray()) – a character array of the uppercase (converted) String.

Now since we know that the array contains a single character, and we are placing this character into a single char, and not an array, we need to specify a single character, as we did to start this entire process:
(((String.valueOf(chars[i])).toUpperCase()).toCharArray())[0];

Once more, with replacement of type:
*On each line, the innermost expression of the line before is replaced by its type.

(((String.valueOf(chars[i])).toUpperCase()).toCharArray())[0];
(((String.valueOf(char)).toUpperCase()).toCharArray())[0];
((String.toUpperCase()).toCharArray())[0];
(String.toCharArray())[0];
(char[])[0];
char;
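The same result can be reached without the round trip through String, using the static methods of the Character class. This is a sketch of an alternative, not a correction; the class name InvertCaseAlternative is made up.

```java
class InvertCaseAlternative
{
    // Inverts the case of each letter; other characters are left alone.
    static String invertCase(String s)
    {
        char[] chars = s.toCharArray();
        for (int i = 0; i < chars.length; ++i)
        {
            char c = chars[i];
            if (Character.isUpperCase(c))
            {
                chars[i] = Character.toLowerCase(c);
            }
            else if (Character.isLowerCase(c))
            {
                chars[i] = Character.toUpperCase(c);
            }
        }
        return new String(chars);
    }

    public static void main(String[] args)
    {
        System.out.println(invertCase("William_Wilson")); // wILLIAM_wILSON
    }
}
```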




I hope that you have learned something about Strings in Java, or reinforced things you already knew.
Good Luck and have fun with Strings.

-William_Wilson

Download Includes: Plain Text and MS Office 2007 Word Versions, plus source files for all examples.

PennyBoki
post 16 Sep, 2007 - 10:26 AM
Post #2
Excellent tutorial, William. That is all I have to say.

1lacca
post 16 Sep, 2007 - 02:38 PM
Post #3
Very good tutorial, and also a very good subject. I am sure it would answer a good number of the questions regularly surfacing on DIC.
However (as usual) I have found something that might need a bit more explanation:
QUOTE

String()
-Creates an empty String, which is essentially a single char as the null terminating character ‘\0’


QUOTE

int length()
-Returns the length of the String. A count of all characters, including spaces.


According to this, a String made with the given constructor is supposed to return 1 for length, but it is obviously not the case. I am not totally sure about this, but I think in Java - unlike C/C++ where this is true - the '\0' control char is not really used (not even defined as the end of String char?). Strings are immutable objects, and they store their length, so actually the end character is not needed. I've run this snippet:
CODE

        char a[] = {'a','b','\0','c'};
        System.out.println(""+new String(a));

And its output shows that the '\0' char is not the end of the String. I can't copy it here, but the '\0' turns up as a square, and the c char is visible as well. I haven't looked up the official doc on this, but if anyone knows anything more, please share, or I'll have to look up the spec.

William_Wilson
post 12 Dec, 2007 - 12:03 PM
Post #4
If you didn't point it out, 1lacca, I might get a big head.

Yes, the '\0' character will print if it is converted from char to String. The conversion gets the ASCII value and converts it to a non-character, eg the square character. When implanted directly in a String it will terminate the String.

1lacca
post 12 Dec, 2007 - 12:19 PM
Post #5
QUOTE(William_Wilson @ 12 Dec, 2007 - 09:03 PM) *

When implanted directly in a String it will terminate the String.

But how do you do that in Java?

Programmist
post 17 Dec, 2007 - 11:03 AM
Post #6
In a recent project I used the null character (ASCII 0) as a "padding" character for network transport level messages. I can tell you, from experience, that it does not terminate the String. You could try either of the following:

CODE
String a = "1234" + (char)0 + "56";
String b = "1234" + '\0' + "56";


And you'll find that both will insert a null character, but not terminate the String. You can verify this by printing the String and/or inspecting it with a debugger.

Also, The Java Language Specification says:
QUOTE
In the Java programming language, unlike C, an array of char is not a String, and neither a String nor an array of char is terminated by '\u0000' (the NUL character).

Programmist
post 17 Dec, 2007 - 11:46 AM
Post #7
Immutability
Also, something else about Strings and immutability. I've heard people say that they are inefficient compared to C++ strings. This may be true to some small extent (offset somewhat by the JVM's constant String pool), but the design decision was made for very good reason. According to James Gosling (inventor of Java):

QUOTE
One of the things that forced Strings to be immutable was security. You have a file open method. You pass a String to it. And then it's doing all kind of authentication checks before it gets around to doing the OS call. If you manage to do something that effectively mutated the String, after the security check and before the OS call, then boom, you're in. But Strings are immutable, so that kind of attack doesn't work. That precise example is what really demanded that Strings be immutable.

So, while this may annoy C++ to Java converts at first, they'd be happy to know that their code won't be vulnerable to those kinds of attacks.

Constant String Pool Behavior
Another tidbit concerning the constant String pool. As you mentioned, invoking the new keyword does create a new instance of a String. So the following code will print true:

CODE
String s1 = "Tony";
String s2 = "Tony";
System.out.println(s1==s2); // true.


This code will print false:

CODE
String s1 = "Tony";
String s2 = new String("Tony");
System.out.println(s1==s2); // false.


However, did you know that runtime concatenation also creates a new String?

CODE
String s1 = "Tony";
String s2 = "T" + "ony";
String s3 = "T";
String s4 = "ony";
s3+= s4;
System.out.println(s1==s2); //true.
System.out.println(s1==s3); //false.


But, if you use constant Strings (final in Java) then it's obviously a compile-time concatenation.

CODE
String s1 = "Tony";
final String s2 = "T";
final String s3 = "ony";
System.out.println(s1==s2+s3); //true.


Since s2 and s3 were marked as final then s2+s3 is the same as saying "T" + "ony". So this is a compile-time concatenation and no new String is created.
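A String created at runtime can still be mapped back to the pooled copy with intern(). The sketch below shows this, assuming a made-up class InternExample with a helper joinAtRuntime that forces the concatenation to happen at runtime.

```java
class InternExample
{
    // The parameters are not compile-time constants, so this
    // concatenation happens at runtime and creates a new String.
    static String joinAtRuntime(String a, String b)
    {
        return a + b;
    }

    public static void main(String[] args)
    {
        String s1 = "Tony";
        String s2 = joinAtRuntime("T", "ony");

        System.out.println(s1 == s2);          // false, different objects
        System.out.println(s1.equals(s2));     // true, same characters
        System.out.println(s1 == s2.intern()); // true, intern() returns the pooled String
    }
}
```

In ordinary code equals() remains the right way to compare Strings; == only tells you whether two references point at the same object.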

1lacca
post 18 Dec, 2007 - 03:52 AM
Post #8
QUOTE(Programmist @ 17 Dec, 2007 - 08:46 PM) *

According to James Gosling (inventor of Java):

QUOTE
One of the things that forced Strings to be immutable was security. You have a file open method. You pass a String to it. And then it's doing all kind of authentication checks before it gets around to doing the OS call. If you manage to do something that effectively mutated the String, after the security check and before the OS call, then boom, you're in. But Strings are immutable, so that kind of attack doesn't work. That precise example is what really demanded that Strings be immutable.

So, while this may annoy C++ to Java converts at first, they'd be happy to know that their code won't be vulnerable to those kinds of attacks.


With all due respect to Mr Gosling, I've read several security experts writing that storing sensitive data in a Java String is a risk, because it is immutable. They've reasoned that it is better to put it into mutable objects (char arrays, etc.) instead, so you can overwrite it once you don't need them anymore and they don't stay in memory hanging around and maybe even swapped out before garbage collection making them visible to anyone who can access the swap file. I really don't know who to trust here, but probably both strategies have their respective uses.

QUOTE(Programmist @ 17 Dec, 2007 - 08:03 PM) *

Also, The Java Language Specification says:
QUOTE
In the Java programming language, unlike C, an array of char is not a String, and neither a String nor an array of char is terminated by '\u0000' (the NUL character).



Now that is what I intended to look up, I just never got around to doing so.
Thank you Programmist!

QUOTE(Programmist @ 17 Dec, 2007 - 08:03 PM) *

In a recent project I used the null character (ASCII 0) as a "padding" character for network transport level messages. I can tell you, from experience, that it does not terminate the String. You could try either of the following:

CODE
String a = "1234" + (char)0 + "56";
String b = "1234" + '\0' + "56";


And you'll find that both will insert a null character, but not terminate the String. You can verify this by printing the String and/or inspecting it with a debugger.


Yup, this is what I've shown in post #3.

Programmist
post 18 Dec, 2007 - 09:50 AM
Post #9
QUOTE(1lacca @ 18 Dec, 2007 - 05:52 AM) *

With all due respect to Mr Gosling, I've read several security experts writing that storing sensitive data in a Java String is a risk, because it is immutable. They've reasoned that it is better to put it into mutable objects (char arrays, etc.) instead, so you can overwrite it once you don't need them anymore and they don't stay in memory hanging around and maybe even swapped out before garbage collection making them visible to anyone who can access the swap file. I really don't know who to trust here, but probably both strategies have their respective uses.


I've heard the other argument and it is valid. Thanks for pointing that out. I think, however, that it is exceptional given that someone would have to be snooping on the machine where the JVM is running. But, this is no excuse to let it go. That's why the Java API uses char arrays in certain areas, like: javax.swing.JPasswordField.getPassword(). I think in this situation "largest danger" was taken care of by making Strings immutable and then the exceptional cases are handled individually.
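The char array approach usually comes down to overwriting the array as soon as the secret has been used. A minimal sketch, in which the class PasswordHandling and the method checkAndWipe are hypothetical and the comparison is only a stand-in for a real authentication step:

```java
import java.util.Arrays;

class PasswordHandling
{
    // Compares the password, then overwrites it so it does not linger in memory.
    static boolean checkAndWipe(char[] password, char[] expected)
    {
        try
        {
            return Arrays.equals(password, expected);
        }
        finally
        {
            Arrays.fill(password, '\0'); // cannot be done with an immutable String
        }
    }

    public static void main(String[] args)
    {
        char[] password = {'s', 'e', 'c', 'r', 'e', 't'};
        char[] expected = {'s', 'e', 'c', 'r', 'e', 't'};

        System.out.println(checkAndWipe(password, expected)); // true
        System.out.println(password[0] == '\0');              // true, contents wiped
    }
}
```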

QUOTE(1Lacca @ 17 Dec, 2007 - 08:03 PM) *

Yup, this is what I've shown in post #3.

I thought it needed more examples.

1lacca
post 18 Dec, 2007 - 11:46 AM
Post #10
True, one can never have too many examples.

Evileyeball
post 6 Jan, 2008 - 08:54 AM
Post #11
Thanks, this is helping me out a lot with my term project, as is this whole forum. You guys are great.

William_Wilson
post 6 Jan, 2008 - 01:07 PM
Post #12
Wow, you guys are amazing.

Yes, it must be a compiler issue, as running within my usual compiler it cuts off at the '\0' character, but in the command prompt it carries on as if it was not there. From all I'd read it did not seem like that should be the case, as you have clearly demonstrated, but it was hard to think the output I was seeing was wrong too.
Guess you can always learn something more.