Deep Learning (DL) has achieved great success across a wide range of applications. However, Deep Neural Networks (DNNs) can be easily fooled by adversarial input samples.

These fabricated samples carry perturbations that are imperceptible to humans, yet they can cause DL models to misbehave in various ways.


Types of adversarial attacks

  • White box: The adversaries are assumed to have full knowledge of their target DL model, including its architecture and parameters.
  • Black box: The adversaries only have access to the inputs and outputs of the DL model, and know nothing about the underlying architecture.
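A classic white-box attack is the Fast Gradient Sign Method (FGSM): with full access to the model, the attacker steps each input feature in the sign of the loss gradient. The sketch below, with made-up weights and data, applies FGSM to a toy logistic-regression "model"; all values here are illustrative assumptions, not a real trained network.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical trained weights of a tiny logistic-regression model
w = np.array([2.0, -3.0, 1.5])
b = 0.5

def predict(x):
    # Probability that x belongs to class 1
    return sigmoid(w @ x + b)

def fgsm(x, y, eps):
    # White-box: we can compute the gradient of the binary
    # cross-entropy loss w.r.t. the input, which is (p - y) * w
    grad_x = (predict(x) - y) * w
    # Step every feature by eps in the direction that increases the loss
    return x + eps * np.sign(grad_x)

x = np.array([1.0, 0.5, -0.2])   # a clean input of true class 1
x_adv = fgsm(x, y=1.0, eps=0.3)

print(predict(x))      # confident in the correct class
print(predict(x_adv))  # confidence collapses after a small tweak
```

Each feature moves by at most eps, so the adversarial input stays close to the original while the model's confidence in the correct class drops sharply.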

How to protect your neural network from adversarial attacks?

Adversarial training:

Generate adversarial examples against your own network and include them in the training set, so the model learns to classify them correctly instead of being fooled.
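The loop above can be sketched end to end: at each step, craft FGSM perturbations against the current model, then update on both the clean and the adversarial batch. The data, epsilon, and learning rate below are illustrative assumptions on a toy logistic-regression model.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)

# Toy two-class data: two Gaussian blobs (made up for illustration)
n = 100
X = np.vstack([rng.normal(-1.0, 1.0, size=(n, 2)),
               rng.normal(+1.0, 1.0, size=(n, 2))])
y = np.concatenate([np.zeros(n), np.ones(n)])

w = np.zeros(2)
b = 0.0
eps = 0.2   # attack strength assumed during training
lr = 0.1

for _ in range(200):
    # Craft FGSM perturbations against the current model (white-box)
    p = sigmoid(X @ w + b)
    X_adv = X + eps * np.sign((p - y)[:, None] * w)
    # Gradient step on both the clean and the adversarial batch
    for Xb in (X, X_adv):
        p = sigmoid(Xb @ w + b)
        w -= lr * (Xb.T @ (p - y)) / len(y)
        b -= lr * np.mean(p - y)
```

Because the model sees its own worst-case perturbations during training, it keeps high accuracy even on inputs attacked at the same epsilon.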

Defensive distillation:

Train a secondary model on the softened output probabilities of the first, produced at a high temperature. This smooths the model's surface in the directions an attacker would exploit, making it harder to find the small input tweaks that lead to misclassification.
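The key mechanism is the temperature: dividing the logits by T softens the output probabilities used as training targets for the student and shrinks the input gradients an attacker relies on. A minimal sketch of that effect, using assumed weights on a toy logistic-regression model (not a real distilled network):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical teacher weights
w = np.array([2.0, -3.0, 1.5])
b = 0.5
T = 10.0  # distillation temperature (illustrative choice)

x = np.array([1.0, 0.5, -0.2])
z = w @ x + b

# Softened probability: closer to 0.5, used as the soft label
# that the student model is trained on
p_soft = sigmoid(z / T)

# Input gradient of the loss at temperature 1 vs. temperature T:
# the high-temperature model's gradient is scaled down by 1/T,
# which is what starves gradient-based attacks like FGSM
grad_standard = sigmoid(z) * (1 - sigmoid(z)) * w
grad_distilled = sigmoid(z / T) * (1 - sigmoid(z / T)) * (w / T)
```

With a much smaller input gradient, the sign-of-gradient step an attacker computes carries far less signal about which tweaks flip the prediction.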