RednaxelaFX

The Static Typing Hole Caused by Array Covariance

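In the previous post, an example that fails Java bytecode verification, we saw that the JVM verifies the .class files it loads in order to guarantee type safety. There is, however, one kind of error in Java that neither the compiler nor the JVM's bytecode verifier can detect, and that only shows up when the code actually runs: the hole in the static type system caused by array covariance.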

As in the previous post, we use ASM to generate the bytecode:
import java.io.FileOutputStream;
import org.objectweb.asm.ClassWriter;
import org.objectweb.asm.MethodVisitor;
import org.objectweb.asm.Opcodes;

public class TestASM implements Opcodes {
    public static void main(String[] args) throws Exception {
        ClassWriter cw = new ClassWriter(0);
        cw.visit(
            V1_5,               // class format version
            ACC_PUBLIC,         // class modifiers
            "TestVerification", // fully qualified class name
            null,               // generic signature
            "java/lang/Object", // fully qualified superclass name
            new String[] { }    // implemented interfaces
        );
        
        MethodVisitor mv = cw.visitMethod(
            ACC_PUBLIC + ACC_STATIC,  // access modifiers
            "main",                   // method name
            "([Ljava/lang/String;)V", // method descriptor
            null,                     // generic signature
            null                      // exceptions
        );
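        mv.visitCode();
        mv.visitInsn(ICONST_1);
        mv.visitTypeInsn(ANEWARRAY, "java/lang/Float");     // new Float[1]
        mv.visitTypeInsn(CHECKCAST, "[Ljava/lang/Object;"); // (Object[]) cast, always succeeds
        mv.visitVarInsn(ASTORE, 0);                         // local 0 = the array
        mv.visitVarInsn(ALOAD, 0);
        mv.visitInsn(ICONST_0);
        mv.visitLdcInsn("a string");
        mv.visitInsn(AASTORE);                              // array[0] = "a string"; throws at run time
        mv.visitVarInsn(ALOAD, 0);
        mv.visitInsn(ICONST_0);
        mv.visitInsn(AALOAD);                               // array[0]
        // Note: the real descriptor of Object.toString is "()Ljava/lang/String;".
        // "()V" is kept here to match the dump below; the AASTORE above throws
        // before this method reference is ever resolved.
        mv.visitMethodInsn(INVOKEVIRTUAL, "java/lang/Object", "toString", "()V");
        mv.visitInsn(RETURN);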
        mv.visitMaxs(3, 1);
        mv.visitEnd(); // end method
        cw.visitEnd(); // end class
        
        byte[] clz = cw.toByteArray();
        FileOutputStream out = new FileOutputStream("TestVerification.class");
        out.write(clz);
        out.close();
    }
}


This produces:
public class TestVerification extends java.lang.Object
  minor version: 0
  major version: 49
  Constant pool:
const #1 = Asciz        TestVerification;
const #2 = class        #1;     //  TestVerification
const #3 = Asciz        java/lang/Object;
const #4 = class        #3;     //  java/lang/Object
const #5 = Asciz        main;
const #6 = Asciz        ([Ljava/lang/String;)V;
const #7 = Asciz        java/lang/Float;
const #8 = class        #7;     //  java/lang/Float
const #9 = Asciz        [Ljava/lang/Object;;
const #10 = class       #9;     //  "[Ljava/lang/Object;"
const #11 = Asciz       a string;
const #12 = String      #11;    //  a string
const #13 = Asciz       toString;
const #14 = Asciz       ()V;
const #15 = NameAndType #13:#14;//  toString:()V
const #16 = Method      #4.#15; //  java/lang/Object.toString:()V
const #17 = Asciz       Code;

{
public static void main(java.lang.String[]);
  Code:
   Stack=3, Locals=1, Args_size=1
   0:   iconst_1
   1:   anewarray       #8; //class java/lang/Float
   4:   checkcast       #10; //class "[Ljava/lang/Object;"
   7:   astore_0
   8:   aload_0
   9:   iconst_0
   10:  ldc     #12; //String a string
   12:  aastore
   13:  aload_0
   14:  iconst_0
   15:  aaload
   16:  invokevirtual   #16; //Method java/lang/Object.toString:()V
   19:  return

}


This time the code can also be expressed directly in Java source:
public class TestVerification {
    public static void main(String[] args) {
        Object[] array = (Object[]) new Float[1];
        array[0] = "a string"; // this is where the problem is
        array[0].toString();
    }
}

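It compiles without any problem. The code fully conforms to the Java language specification and also satisfies the type requirements of the JVM's static verification, so verification at load time passes as well.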

But when we run it...
Exception in thread "main" java.lang.ArrayStoreException: java.lang.String
        at TestVerification.main(Unknown Source)

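Obviously a String object cannot be stored into a Float[]. But because Java arrays are covariant, Java's static type system allows the store, and the error only surfaces as an exception at run time.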
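To make the contrast concrete, here is a minimal sketch (the class and method names are made up for illustration) that triggers the same failure from plain Java and compares it with generic collections, which are invariant and therefore reject the analogous assignment at compile time.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class, not part of the original post.
class CovarianceDemo {
    // Returns true if storing `value` into slot 0 of `array` is rejected at run time.
    static boolean storeIsRejected(Object[] array, Object value) {
        try {
            array[0] = value; // compiles to aastore, which checks the element type
            return false;
        } catch (ArrayStoreException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        Object[] floats = new Float[1]; // allowed: Float[] is a subtype of Object[]
        System.out.println(storeIsRejected(floats, "a string"));        // prints "true"
        System.out.println(storeIsRejected(floats, Float.valueOf(1f))); // prints "false"

        // Generic collections are invariant, so the analogous assignment
        // is rejected by the compiler instead of failing at run time:
        List<Float> list = new ArrayList<>();
        // List<Object> objects = list; // does not compile: incompatible types
        list.add(1.0f);
    }
}
```

The array version type-checks and fails only when the store executes; the commented-out `List` assignment never gets that far.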

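.NET unfortunately copied this feature from Java and also made its arrays covariant. As a result the CLI, like the JVM, must perform a dynamic type check on array stores at run time (JVM: aastore; CLI: stelem). That naturally hurts performance, and it also makes VM implementations more complex. Sigh.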
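The dynamic check can be pictured roughly as follows. This is only a conceptual sketch written in plain Java with reflection; `checkedStore` is a hypothetical helper, and a real VM implements the check internally with much cheaper fast paths rather than with reflective calls.

```java
// Hypothetical illustration of the check behind aastore; not how a VM is actually written.
class ArrayStoreCheckSketch {
    static void checkedStore(Object[] array, int index, Object value) {
        // The run-time element type of the array, e.g. Float for a Float[]
        // even when the static type of the reference is Object[].
        Class<?> componentType = array.getClass().getComponentType();
        // null may be stored into any reference array; anything else must be
        // an instance of the array's actual element type.
        if (value != null && !componentType.isInstance(value)) {
            throw new ArrayStoreException(value.getClass().getName());
        }
        array[index] = value;
    }

    public static void main(String[] args) {
        Object[] floats = new Float[1];
        checkedStore(floats, 0, Float.valueOf(2f)); // fine
        checkedStore(floats, 0, "a string");        // throws ArrayStoreException
    }
}
```

The point of the sketch is the cost: every store into a reference array has to consult the array's run-time element type, unless the VM can prove statically that the check is redundant.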

The second-to-last paragraph on page 289 of the reprint edition of "Virtual Machines: Versatile Platforms for Systems and Processes" says:
Hence, if an object is accessed, the field information for the access can also be checked statically (there is an exception for arrays, given in the next paragraph).

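The next paragraph of the book, however, only mentions the dynamic bounds check on array accesses and says nothing about the static typing hole introduced by covariance. I think covariance deserves a mention there. After all, array length is not part of Java's static types, so it can only be checked at run time (although a VM can eliminate many array bounds checks and null checks through data-flow analysis). Type covariance, on the other hand, is part of the static type system, yet it has a hole, so a run-time check is still needed. That is the annoying part.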

Here is what Martin Odersky said about Java's covariant arrays in a recent interview:
Bill Venners: You said you found it frustrating at times to have the constraints of needing to be backwards compatible with Java. Can you give some specific examples of things you couldn't do when you were trying to live within those constraints, which you were then able to do when you changed to doing something that's binary but not source compatible?

Martin Odersky: In the generics design, there were a lot of very, very hard constraints. The strongest constraint, the most difficult to cope with, was that it had to be fully backwards compatible with ungenerified Java. The story was the collections library had just shipped with 1.2, and Sun was not prepared to ship a completely new collections library just because generics came about. So instead it had to just work completely transparently.

That's why there were a number of fairly ugly things. You always had to have ungenerified types with generified types, the so called raw types. Also you couldn't change what arrays were doing so you had unchecked warnings. Most importantly you couldn't do a lot of the things you wanted to do with arrays, like generate an array with a type parameter T, an array of something where you didn't know the type. You couldn't do that. Later in Scala we actually found out how to do that, but that was possible only because we could drop in Scala the requirement that arrays are covariant.

Bill Venners: Can you elaborate on the problem with Java's covariant arrays?

Martin Odersky: When Java first shipped, Bill Joy and James Gosling and the other members of the Java team thought that Java should have generics, only they didn't have the time to do a good job designing it in. So because there would be no generics in Java, at least initially, they felt that arrays had to be covariant. That means an array of String is a subtype of array of Object, for example. The reason for that was they wanted to be able to write, say, a “generic” sort method that took an array of Object and a comparator and that would sort this array of Object. And then let you pass an array of String to it. It turns out that this thing is type unsound in general. That's why you can get an array store exception in Java. And it actually also turns out that this very same thing blocks a decent implementation of generics for arrays. That's why arrays in Java generics don't work at all. You can't have an array of list of string, it's impossible. You're forced to do the ugly raw type, just an array of list, forever. So it was sort of like an original sin. They did something very quickly and thought it was a quick hack. But it actually ruined every design decision later on. So in order not to fall into the same trap again, we had to break off and say, now we will not be upwards compatible with Java, there are some things we want to do differently.
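The pre-generics "generic" sort that Odersky describes can be sketched as below, together with the generic array creation restriction he alludes to. The class name and the insertion sort are illustrative assumptions, not code from the interview. Note that current Java does let you declare the type `List<String>[]`; what the compiler rejects is creating one with `new List<String>[n]`, which is why the raw-type workaround appears.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical demo class, not part of the original post.
class PreGenericsSort {
    // A pre-generics style sort: it accepts any reference array because
    // arrays are covariant, so a String[] can be passed as an Object[].
    @SuppressWarnings({"rawtypes", "unchecked"})
    static void sort(Object[] items, Comparator order) {
        // Simple insertion sort; it only moves existing elements around,
        // so every store passes the run-time element type check.
        for (int i = 1; i < items.length; i++) {
            Object current = items[i];
            int j = i - 1;
            while (j >= 0 && order.compare(items[j], current) > 0) {
                items[j + 1] = items[j];
                j--;
            }
            items[j + 1] = current;
        }
    }

    public static void main(String[] args) {
        String[] names = { "pizza", "gj", "scala" };
        sort(names, String.CASE_INSENSITIVE_ORDER); // String[] where Object[] is expected
        System.out.println(Arrays.toString(names)); // prints "[gj, pizza, scala]"

        // List<String>[] lists = new List<String>[1]; // does not compile: generic array creation
        @SuppressWarnings("unchecked")
        List<String>[] lists = (List<String>[]) new List[1]; // raw-type workaround, unchecked cast
        lists[0] = Arrays.asList("a", "b");
        System.out.println(lists[0]); // prints "[a, b]"
    }
}
```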


P.S. If you don't know what covariance is, read the Wikipedia entry on it.

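P.P.S. For those who don't know Martin Odersky: whenever you use Java 5 generics, his work is in your code. He designed the Pizza language and later took part in designing GJ (Generic Java), which became the foundation of generics in Java 5. Martin also designed Scala, and far more people know Scala than Pizza...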
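Comments
#3 Saito 2009-05-05
RednaxelaFX wrote:

Saito wrote: Could you come over and take a look at something? I'd like your help clearing up a question.
http://www.iteye.com/topic/378747
OK, replied. Everyone gets muddled now and then when observing behavior. When I wrote the previous post just now I slipped up and forgot the return; it didn't affect the conclusion, but it still wasn't good. Carefulness is a hard habit to build... at least for me XD

Haha, thanks again.
#2 RednaxelaFX 2009-05-05
Saito wrote:
Could you come over and take a look at something? I'd like your help clearing up a question.
http://www.iteye.com/topic/378747

OK, replied. Everyone gets muddled now and then when observing behavior. When I wrote the previous post just now I slipped up and forgot the return; it didn't affect the conclusion, but it still wasn't good. Carefulness is a hard habit to build... at least for me XD
#1 Saito 2009-05-05
Could you come over and take a look at something? I'd like your help clearing up a question.

http://www.iteye.com/topic/378747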
