
    Fixing the train-test resolution discrepancy

    Data augmentation is key to the training of neural networks for image classification. This paper first shows that existing augmentations induce a significant discrepancy between the typical size of the objects seen by the classifier at train and test time. We experimentally validate that, for a target test resolution, using a lower train resolution offers better classification at test time. We then propose a simple yet effective and efficient strategy to optimize the classifier performance when the train and test resolutions differ. It involves only a computationally cheap fine-tuning of the network at the test resolution. This enables training strong classifiers using small training images. For instance, we obtain 77.1% top-1 accuracy on ImageNet with a ResNet-50 trained on 128×128 images, and 79.8% with one trained on 224×224 images. In addition, with extra training data we reach 82.5% with the ResNet-50 trained on 224×224 images. Conversely, when taking a ResNeXt-101 32x48d pre-trained in weakly-supervised fashion on 940 million public images at resolution 224×224 and further optimizing it for a test resolution of 320×320, we obtain a test top-1 accuracy of 86.4% (top-5: 98.0%, single-crop). To the best of our knowledge, this is the highest ImageNet single-crop top-1 and top-5 accuracy to date.
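
    The abstract describes a two-phase recipe: ordinary training at a low resolution, followed by a computationally cheap fine-tuning pass at the intended test resolution. The sketch below illustrates that flow in PyTorch under stated assumptions: a torchvision ResNet-50, random stand-in batches, illustrative resolutions (128 for training, 224 for testing), and unfreezing only the final classifier during fine-tuning; the paper's exact recipe (which layers are updated, schedules, augmentations) is not specified by the abstract and may differ.

    # Minimal sketch of the train-then-fine-tune resolution strategy described above.
    # Assumptions (not from the abstract): torchvision ResNet-50, random stand-in
    # batches, and fine-tuning only the final classifier layer.
    import torch
    import torch.nn as nn
    from torchvision import models

    TRAIN_RES, TEST_RES = 128, 224  # low train resolution, higher test resolution

    model = models.resnet50(num_classes=1000)
    criterion = nn.CrossEntropyLoss()

    def step(model, optimizer, resolution):
        """One training step on a random stand-in batch at the given resolution."""
        images = torch.randn(8, 3, resolution, resolution)
        labels = torch.randint(0, 1000, (8,))
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()
        return loss.item()

    # Phase 1: ordinary supervised training at the low resolution (128x128 crops).
    opt = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)
    step(model, opt, TRAIN_RES)

    # Phase 2: cheap fine-tuning at the test resolution; here only the classifier
    # is unfrozen (the method also re-estimates batch-norm statistics).
    for p in model.parameters():
        p.requires_grad = False
    for p in model.fc.parameters():
        p.requires_grad = True
    opt_ft = torch.optim.SGD(model.fc.parameters(), lr=0.01, momentum=0.9)
    step(model, opt_ft, TEST_RES)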

    Co-training 2^L Submodels for Visual Recognition

    We introduce submodel co-training, a regularization method related to co-training, self-distillation and stochastic depth. Given a neural network to be trained, for each sample we implicitly instantiate two altered networks, "submodels", with stochastic depth: we activate only a subset of the layers. Each network serves as a soft teacher to the other, by providing a loss that complements the regular loss provided by the one-hot label. Our approach, dubbed cosub, uses a single set of weights and does not involve a pre-trained external model or temporal averaging. Experimentally, we show that submodel co-training is effective for training backbones for recognition tasks such as image classification and semantic segmentation. Our approach is compatible with multiple architectures, including RegNet, ViT, PiT, XCiT, Swin and ConvNeXt, and improves their results in comparable settings. For instance, a ViT-B pretrained with cosub on ImageNet-21k obtains 87.4% top-1 accuracy at resolution 448 on ImageNet-val.
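
    As a rough illustration of the mechanism described above, the sketch below instantiates two stochastic-depth submodels of a single network on each batch and lets each act as a soft teacher for the other, alongside the usual one-hot loss. Everything concrete here is an assumption rather than the paper's setup: a toy residual MLP, a fixed layer-drop rate, and a simple symmetric KL distillation term with unit weight.

    # Minimal sketch of the cosub idea: two stochastic-depth "submodels" of one
    # network, each distilling into the other. Toy architecture and loss weights
    # are assumptions, not the paper's configuration.
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class ResidualMLP(nn.Module):
        def __init__(self, dim=64, depth=8, num_classes=10, drop_rate=0.5):
            super().__init__()
            self.blocks = nn.ModuleList([
                nn.Sequential(nn.LayerNorm(dim), nn.Linear(dim, dim), nn.GELU())
                for _ in range(depth)
            ])
            self.head = nn.Linear(dim, num_classes)
            self.drop_rate = drop_rate

        def forward(self, x, submodel=False):
            # With submodel=True, each residual block is kept with probability
            # 1 - drop_rate, so a forward pass activates only a subset of layers.
            for block in self.blocks:
                if submodel and torch.rand(()) < self.drop_rate:
                    continue  # skip this layer (stochastic depth)
                x = x + block(x)
            return self.head(x)

    model = ResidualMLP()
    opt = torch.optim.AdamW(model.parameters(), lr=1e-3)

    x = torch.randn(16, 64)          # stand-in batch
    y = torch.randint(0, 10, (16,))  # class labels (one-hot targets)

    # Two submodels share the same weights; only the dropped layers differ.
    logits_a = model(x, submodel=True)
    logits_b = model(x, submodel=True)

    # Regular loss from the one-hot labels for both submodels.
    ce = F.cross_entropy(logits_a, y) + F.cross_entropy(logits_b, y)
    # Each submodel is a soft teacher for the other (teacher side detached).
    kl_ab = F.kl_div(F.log_softmax(logits_a, -1),
                     F.softmax(logits_b, -1).detach(), reduction="batchmean")
    kl_ba = F.kl_div(F.log_softmax(logits_b, -1),
                     F.softmax(logits_a, -1).detach(), reduction="batchmean")
    loss = ce + kl_ab + kl_ba

    opt.zero_grad()
    loss.backward()
    opt.step()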

    Facile synthesis of 3-substituted thieno[3,2-b]furan derivatives

    A facile synthesis of dimethyl 3-hydroxythieno[3,2-b]furan-2,5-dicarboxylate is reported, starting from the available materials methyl thioglycolate and dimethyl acetylenedicarboxylate. This compound represents an efficient precursor for the synthesis of 3-substituted thieno[3,2-b]furan derivatives.