Traditional fixed pass-phrase or text-dependent speaker verification systems are vulnerable to replay and spoofing attacks. Random pass-phrase generation, speech verification, and text-independent speaker verification can be combined into a composite speaker verification system that is robust to such attacks. This thesis deals with combining speech verification with text-independent speaker verification for this purpose. A method for robust, automatic speech verification using a speech recognizer in forced-alignment mode is proposed and evaluated. A text-independent speaker verification system was developed in MATLAB for training and evaluating Gaussian mixture density-based target speaker and background speaker models. Equal-error rate is the performance metric used in all speaker verification evaluations. To speed up background model training, a simple technique based on sub-sampling (decimating) speech frames is presented. Two different feature extraction implementations are evaluated, along with the impact of different speech feature configurations on performance. Further, to mitigate problems with limited training data and to improve performance, target speaker models are created by Bayesian adaptation of background speaker models with target speaker training data. The performance of these models is evaluated and compared with that of conventional target speaker models. The impact of test-utterance length, variance limiting, and the use of training data from multiple recording sessions is also investigated.
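As an illustration of the equal-error rate named above as the evaluation metric, the following is a minimal sketch in Python, not the thesis's MATLAB implementation. It assumes that a trial is accepted when its score is at or above a threshold, and it approximates the equal-error rate at the observed score where the false-acceptance and false-rejection rates are closest. The function name `equal_error_rate` and the example scores are hypothetical.

```python
def equal_error_rate(target_scores, impostor_scores):
    """Approximate the equal-error rate (EER) from verification scores.

    Assumption: a trial is accepted when score >= threshold.
    Returns (eer, threshold) at the observed score where the
    false-acceptance and false-rejection rates are closest.
    """
    target = [float(s) for s in target_scores]
    impostor = [float(s) for s in impostor_scores]
    if not target or not impostor:
        raise ValueError("need at least one target and one impostor score")

    best = None
    # Every distinct observed score is a candidate threshold.
    for t in sorted(set(target + impostor)):
        # False rejection: genuine-speaker trials scoring below the threshold.
        frr = sum(s < t for s in target) / len(target)
        # False acceptance: impostor trials scoring at or above the threshold.
        far = sum(s >= t for s in impostor) / len(impostor)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            best = (gap, (far + frr) / 2.0, t)
    return best[1], best[2]


# Hypothetical scores (e.g. log-likelihood ratios), for illustration only.
eer, threshold = equal_error_rate([2.0, 3.0, 4.0, 5.0], [0.0, 1.0, 2.5, 1.5])
print(eer, threshold)  # prints "0.25 2.5"
```

With few trials the two error rates may never cross exactly, so this sketch reports their average at the closest point; interpolating between neighbouring thresholds is a common refinement.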