Description: A deep-learning-based method for video face forgery detection that works in two stages: a fully temporal convolution network (FTCN), which concentrates on temporal feature extraction by reducing the spatial kernel size of its convolutions to 1, followed by a Temporal Transformer network that explores long-term temporal coherence.
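
To illustrate what "reducing spatial kernel size to 1" means, here is a minimal pure-Python sketch of a single-channel K x 1 x 1 convolution: each output pixel is a weighted sum of the same pixel location across K consecutive frames, with no spatial neighbours involved. The function name `temporal_conv`, the list-based video layout, and the example kernel are assumptions made for illustration only; this is not the method's actual implementation, which would use learned multi-channel 3D convolutions.

```python
from typing import List

Frame = List[List[float]]  # one single-channel frame, indexed [row][col]


def temporal_conv(video: List[Frame], kernel: List[float]) -> List[Frame]:
    """Apply a K x 1 x 1 kernel (cross-correlation, no padding, stride 1).

    Hypothetical helper: `video` is indexed [time][row][col] and `kernel`
    holds K temporal weights. The spatial extent of the kernel is 1 x 1, so
    every pixel location is filtered independently along the time axis.
    """
    k = len(kernel)
    if k == 0 or len(video) < k:
        raise ValueError("need a non-empty kernel and at least len(kernel) frames")
    height, width = len(video[0]), len(video[0][0])
    out: List[Frame] = []
    for t in range(len(video) - k + 1):
        frame = [
            [
                # Weighted sum over time only; (y, x) never changes.
                sum(kernel[d] * video[t + d][y][x] for d in range(k))
                for x in range(width)
            ]
            for y in range(height)
        ]
        out.append(frame)
    return out


# Three 1 x 2 frames; the kernel [1, -1] computes frame-to-frame differences.
video = [[[1.0, 2.0]], [[3.0, 4.0]], [[5.0, 6.0]]]
diffs = temporal_conv(video, [1.0, -1.0])
```

Because the kernel never looks at neighbouring pixels, a layer like this can only respond to how a location changes over time, which matches the description's emphasis on temporal rather than spatial features.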