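MHA Deployment Notes (CentOS 6 + MySQL 5.6)

  • October 5, 2019
  • Notes

Environment and software versions:

    CentOS 6.5 x86_64

    MySQL 5.6.34, compiled from source

    MHA packages: mha4mysql-manager-0.56-0.el6.noarch.rpm, mha4mysql-node-0.56-0.el6.noarch.rpm

Node roles:

    node93: 10.1.20.93   default master

    node94: 10.1.20.94   slave 1; can be promoted to master if the original master goes down [the MHA manager is also deployed on this machine]

    node95: 10.1.20.95   slave 2; must not be promoted to master

    The VIP prepared for the cluster is 10.1.20.100/24.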

Step 1: Configure master-slave replication

Key parts of /etc/my.cnf on node93:

[mysqld]
port            = 3306
socket          = /tmp/mysql.sock
datadir         = /bdata/data/nowdb2
innodb_file_per_table = ON
character-set-server = utf8
default_storage_engine = InnoDB
skip-innodb_adaptive_hash_index
master_info_repository = TABLE
relay_log_info_repository = TABLE
relay_log_recovery = 1    # crash safe
log-bin = mysql-bin
binlog_format = row
sync_binlog = 1    # make sure the binlog is flushed to disk when a transaction commits
log-slave-updates
log_bin_trust_function_creators = 1
binlog_rows_query_log_events = ON    # record the executed statement in the binlog as a query event
server-id = 1020093
relay_log_purge = 0
read_only = 1

Key parts of /etc/my.cnf on node94:

[mysqld]
port            = 3306
socket          = /tmp/mysql.sock
datadir         = /bdata/data/nowdb2
innodb_file_per_table = ON
character-set-server = utf8
default_storage_engine = InnoDB
skip-innodb_adaptive_hash_index
master_info_repository = TABLE
relay_log_info_repository = TABLE
relay_log_recovery = 1    # crash safe
log-bin = mysql-bin
binlog_format = row
sync_binlog = 1    # make sure the binlog is flushed to disk when a transaction commits
log-slave-updates
log_bin_trust_function_creators = 1
binlog_rows_query_log_events = ON    # record the executed statement in the binlog as a query event
server-id = 1020094
relay_log_purge = 0
read_only = 1

Key parts of /etc/my.cnf on node95:

[mysqld]
port            = 3306
socket          = /tmp/mysql.sock
datadir         = /bdata/data/nowdb2
innodb_file_per_table = ON
character-set-server = utf8
default_storage_engine = InnoDB
skip-innodb_adaptive_hash_index
master_info_repository = TABLE
relay_log_info_repository = TABLE
relay_log_recovery = 1    # crash safe
log-bin = mysql-bin
binlog_format = row
sync_binlog = 1    # make sure the binlog is flushed to disk when a transaction commits
log-slave-updates
log_bin_trust_function_creators = 1
binlog_rows_query_log_events = ON    # record the executed statement in the binlog as a query event
server-id = 1020095
relay_log_purge = 0
read_only = 1

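On node93, create the replication account:

GRANT REPLICATION SLAVE, REPLICATION CLIENT ON *.* TO 'rpl'@'10.1.%.%' IDENTIFIED BY 'rpl';

Then set up one master with two slaves (detailed steps omitted).

Note: every node that can become master (node93 and node94) must have the replication account. If the rpl account does not exist on node94, add it there by hand.

Once replication is running, create an MHA management account on the master; it is used later:

grant all on *.* to 'mhauser'@'10.1.%.%' identified by 'Abcd@1234';

(The management account must exist on all of node93, node94, and node95.)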

Step 2: Install MHA

MHA relies on SSH, so passwordless (key-based) SSH login must be set up among the three hosts. Steps omitted.
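The omitted SSH setup could look roughly like the sketch below. It is a dry run that only prints commands, so nothing is changed until you run them by hand. The function name print_ssh_setup is hypothetical, and the sketch assumes the root user (matching ssh_user=root later in these notes) and the stock ssh-keygen and ssh-copy-id tools. Since every node has to reach every other node, the printed commands would be repeated on each of the three hosts.

```shell
#!/bin/sh
# Hosts taken from the node list in these notes.
HOSTS="10.1.20.93 10.1.20.94 10.1.20.95"

# Print the commands one node would run to set up key-based root login
# to all three hosts. Dry run only: nothing is executed.
print_ssh_setup() {
    # Generate a key pair without a passphrase (assumption: none exists yet).
    echo "ssh-keygen -t rsa -N '' -f ~/.ssh/id_rsa"
    # Copy the public key to every host, including the local one.
    for host in $HOSTS; do
        echo "ssh-copy-id root@$host"
    done
}

print_ssh_setup
```

Review the printed lines, then run them on node93, node94 and node95 in turn.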

Install the Perl packages on all three nodes:

yum install perl perl-DBD-MySQL perl-CPAN perl-devel perl-Time-HiRes

Install the node package on node93 through node95:

rpm -ivh mha4mysql-node-0.56-0.el6.noarch.rpm

Install the manager package on node94 (installing it on all three nodes is also fine):

rpm -ivh mha4mysql-manager-0.56-0.el6.noarch.rpm

Initialize MHA on node94:

mkdir /etc/masterha/

vim /etc/masterha/app1.cnf with the following content:

[server default]
user=mhauser
password=Abcd@1234
manager_workdir=/data/masterha/app1
manager_log=/data/masterha/app1/manager.log
remote_workdir=/data/masterha/app1
ssh_user=root
repl_user=rpl
repl_password=rpl
ping_interval=1
# This path must match the directory where MySQL stores its binlogs
master_binlog_dir=/bdata/data/nowdb2/
master_ip_failover_script=/etc/masterha/master_ip_failover
report_script=/etc/masterha/send_report
# Confirms through third-party hosts whether the master is really down; optional, MHA works without it
#secondary_check_script=masterha_secondary_check -s remote_host1 -s remote_host2
# Script that shuts down the failed host after a failure; optional, but if you list it, set it to empty
# shutdown_script=""
# Script that moves the VIP during a manual online switch; optional, MHA works without it.
# If something like keepalived already moves the VIP, you can drop it
master_ip_online_change_script=/etc/masterha/master_ip_online_change

[server1]
hostname=10.1.20.93
candidate_master=1

[server2]
hostname=10.1.20.94
candidate_master=1

[server3]
hostname=10.1.20.95
# never promote this host to master
no_master=1

On node94, add the script /etc/masterha/master_ip_failover (fill in the VIP details):

#!/usr/bin/env perl
use strict;
use warnings FATAL => 'all';
use Getopt::Long;

my (
    $command,          $ssh_user,        $orig_master_host, $orig_master_ip,
    $orig_master_port, $new_master_host, $new_master_ip,    $new_master_port,
    $orig_master_ssh_port, $new_master_ssh_port
);

my $vip      = '10.1.20.100';    # Virtual IP
my $devic    = 'eth0';
my $key      = "0";
my $net_mask = '255.255.255.0';
my $ssh_start_vip = "/sbin/ifconfig $devic:$key $vip netmask $net_mask";
my $ssh_stop_vip  = "/sbin/ifconfig $devic:$key down";
my $mysql_conf    = "/etc/my.cnf";
my $open_readonly  = "/bin/sed -i 's/.*read_only.*/read_only=1/g' $mysql_conf ";
my $close_readonly = "/bin/sed -i 's/.*read_only.*/read_only=0/g' $mysql_conf ";
# Note: both relay_log_purge commands write relay_log_purge=0, as in the original notes.
my $open_relaylog_purge  = "/bin/sed -i 's/.*relay_log_purge.*/relay_log_purge=0/g' $mysql_conf ";
my $close_relaylog_purge = "/bin/sed -i 's/.*relay_log_purge.*/relay_log_purge=0/g' $mysql_conf ";

GetOptions(
    'command=s'              => \$command,
    'ssh_user=s'             => \$ssh_user,
    'orig_master_host=s'     => \$orig_master_host,
    'orig_master_ip=s'       => \$orig_master_ip,
    'orig_master_port=i'     => \$orig_master_port,
    'orig_master_ssh_port=i' => \$orig_master_ssh_port,
    'new_master_host=s'      => \$new_master_host,
    'new_master_ip=s'        => \$new_master_ip,
    'new_master_port=i'      => \$new_master_port,
    'new_master_ssh_port=i'  => \$new_master_ssh_port,
);

exit &main();

sub main {
    print "\n\nIN SCRIPT TEST====$ssh_stop_vip==$ssh_start_vip===\n\n";

    if ( $command eq "stop" || $command eq "stopssh" ) {

        # $orig_master_host, $orig_master_ip, $orig_master_port are passed.
        # If you manage master ip address at global catalog database,
        # invalidate orig_master_ip here.
        my $exit_code = 1;
        eval {
            print "Disabling the VIP on old master: $orig_master_host \n";
            &stop_vip();
            $exit_code = 0;
        };
        if ($@) {
            warn "Got Error: $@\n";
            exit $exit_code;
        }
        exit $exit_code;
    }
    elsif ( $command eq "start" ) {

        # all arguments are passed.
        # If you manage master ip address at global catalog database,
        # activate new_master_ip here.
        # You can also grant write access (create user, set read_only=0, etc) here.
        my $exit_code = 10;
        eval {
            print "Enabling the VIP - $vip on the new master - $new_master_host \n";
            &start_vip();
            $exit_code = 0;
        };
        if ($@) {
            warn $@;
            exit $exit_code;
        }
        exit $exit_code;
    }
    elsif ( $command eq "status" ) {
        print "Checking the Status of the script.. OK \n";

        # `ssh $ssh_user\@cluster1 " $ssh_start_vip "`;
        exit 0;
    }
    else {
        &usage();
        exit 1;
    }
}

# A simple system call that enables the VIP on the new master
sub start_vip() {
    `ssh $ssh_user\@$new_master_host " $ssh_start_vip "`;
    print "Disable read_only and relay_log_purge in my.cnf - on the new master - $new_master_host\n";
    `ssh $ssh_user\@$new_master_host " $close_readonly "`;
    `ssh $ssh_user\@$new_master_host " $close_relaylog_purge "`;
}

# A simple system call that disables the VIP on the old master
sub stop_vip() {
    `ssh $ssh_user\@$orig_master_host " $ssh_stop_vip "`;
    print "Enable read_only and relay_log_purge in my.cnf - on the orig master - $orig_master_host\n";
    `ssh $ssh_user\@$orig_master_host " $open_readonly "`;
    `ssh $ssh_user\@$orig_master_host " $open_relaylog_purge "`;
}

sub usage {
    print
"Usage: master_ip_failover --command=start|stop|stopssh|status --orig_master_host=host --orig_master_ip=ip --orig_master_port=port --new_master_host=host --new_master_ip=ip --new_master_port=port --orig_master_ssh_port=ssh_port --new_master_ssh_port=ssh_port\n";
}

On node94, add the script /etc/masterha/send_report (fill in the SMTP account details):

#!/usr/bin/perl

# Copyright (C) 2011 DeNA Co.,Ltd.
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 2 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
#  along with this program; if not, write to the Free Software
# Foundation, Inc.,
# 51 Franklin Street, Fifth Floor, Boston, MA  02110-1301 USA

## Note: This is a sample script and is not complete. Modify the script based on your environment.

use strict;
use warnings FATAL => 'all';
use Mail::Sender;
use Getopt::Long;

# new_master_host and new_slave_hosts are set only when recovering master succeeded
my ( $dead_master_host, $new_master_host, $new_slave_hosts, $subject, $body, $conf );

my $smtp      = 'smtp.exmail.qq.com';
my $mail_from = '[email protected]';
my $mail_user = '[email protected]';
my $mail_pass = 'xxxxxxx';
my $mail_to   = ['[email protected]'];

GetOptions(
    'orig_master_host=s' => \$dead_master_host,
    'new_master_host=s'  => \$new_master_host,
    'new_slave_hosts=s'  => \$new_slave_hosts,
    'subject=s'          => \$subject,
    'body=s'             => \$body,
    'conf=s'             => \$conf,
);

mailToContacts( $smtp, $mail_from, $mail_user, $mail_pass, $mail_to, $subject, $body );
check_if_sendmail_ok('/tmp/monitormail.log');

sub mailToContacts {
    my ( $smtp, $mail_from, $user, $passwd, $mail_to, $subject, $msg ) = @_;

    # The SMTP dialogue is written to this file and inspected afterwards.
    open my $DEBUG, "> /tmp/monitormail.log"
        or die "Can't open the debug file: $!\n";
    my $sender = Mail::Sender->new(
        {
            ctype       => 'text/plain; charset=utf-8',
            encoding    => 'utf-8',
            smtp        => $smtp,
            from        => $mail_from,
            auth        => 'LOGIN',
            TLS_allowed => '0',
            authid      => $user,
            authpwd     => $passwd,
            to          => $mail_to,
            subject     => $subject,
            debug       => $DEBUG
        }
    );
    $sender->MailMsg(
        {
            msg   => $msg,
            debug => $DEBUG
        }
    ) or print $Mail::Sender::Error;
    return 1;
}

# Check the last three lines of the SMTP debug log for a clean hand-off.
sub check_if_sendmail_ok {

    #>>250 2.0.0 Ok: queued as 3532C6DA009D
    #<<QUIT
    #>>221 2.0.0 Bye
    my $logf = shift;
    open my $rlog, '<', $logf or die "cannot open file $logf.\n";
    my @log = readline($rlog);
    close $rlog;
    my $val = 0;
    if ( $log[$#log] =~ m/>>\s221\s.*\sBye/ ) {
        print "Meet Bye.\t";
        $val++;
    }
    if ( $log[ $#log - 1 ] =~ m/<<\sQUIT/ ) {
        print "Meet QUIT.\t";
        $val++;
    }
    if ( $log[ $#log - 2 ] =~ m/>>\s250\s.*\sOk: queued/ ) {
        print "Meet queued.\t";
        $val++;
    }
    print "\n";
    if ( $val == 3 ) {
        print "send mail success.\n";
    }
    else {
        print "send mail failed. check DNS/SMTP config\n";
    }
    return $val;
}

# Do whatever you want here
exit 0;

On node94, add the script /etc/masterha/master_ip_online_change (fill in the VIP details):

#!/usr/bin/env perl

# Copyright (C) 2011 DeNA Co.,Ltd.
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 2 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
#  along with this program; if not, write to the Free Software
# Foundation, Inc.,
# 51 Franklin Street, Fifth Floor, Boston, MA  02110-1301 USA

## Note: This is a sample script and is not complete. Modify the script based on your environment.

use strict;
use warnings FATAL => 'all';
use Getopt::Long;
use MHA::DBHelper;
use MHA::NodeUtil;
use Time::HiRes qw( sleep gettimeofday tv_interval );
use Data::Dumper;

my $_tstart;
my $_running_interval = 0.1;
my (
  $command,              $orig_master_is_new_slave, $orig_master_host,
  $orig_master_ip,       $orig_master_port,         $orig_master_user,
  $orig_master_password, $orig_master_ssh_user,     $new_master_host,
  $new_master_ip,        $new_master_port,          $new_master_user,
  $new_master_password,  $new_master_ssh_user
);

my $vip = '10.1.20.100/24';
my $key = '0';
my $ssh_start_vip = "/sbin/ifconfig eth0:$key $vip";
my $ssh_stop_vip  = "/sbin/ifconfig eth0:$key down";
my $orig_master_ssh_port = 22;
my $new_master_ssh_port  = 22;

GetOptions(
  'command=s'                => \$command,
  'orig_master_is_new_slave' => \$orig_master_is_new_slave,
  'orig_master_host=s'       => \$orig_master_host,
  'orig_master_ip=s'         => \$orig_master_ip,
  'orig_master_port=i'       => \$orig_master_port,
  'orig_master_user=s'       => \$orig_master_user,
  'orig_master_password=s'   => \$orig_master_password,
  'orig_master_ssh_user=s'   => \$orig_master_ssh_user,
  'new_master_host=s'        => \$new_master_host,
  'new_master_ip=s'          => \$new_master_ip,
  'new_master_port=i'        => \$new_master_port,
  'new_master_user=s'        => \$new_master_user,
  'new_master_password=s'    => \$new_master_password,
  'new_master_ssh_user=s'    => \$new_master_ssh_user,
  'orig_master_ssh_port=i'   => \$orig_master_ssh_port,
  'new_master_ssh_port=i'    => \$new_master_ssh_port,
);

exit &main();

sub current_time_us {
  my ( $sec, $microsec ) = gettimeofday();
  my $curdate = localtime($sec);
  return $curdate . " " . sprintf( "%06d", $microsec );
}

# Sleep for whatever is left of the current polling interval.
sub sleep_until {
  my $elapsed = tv_interval($_tstart);
  if ( $_running_interval > $elapsed ) {
    sleep( $_running_interval - $elapsed );
  }
}

# Return the application threads from SHOW PROCESSLIST, skipping our own
# connection, replication threads and idle sessions.
sub get_threads_util {
  my $dbh                    = shift;
  my $my_connection_id       = shift;
  my $running_time_threshold = shift;
  my $type                   = shift;
  $running_time_threshold = 0 unless ($running_time_threshold);
  $type                   = 0 unless ($type);
  my @threads;

  my $sth = $dbh->prepare("SHOW PROCESSLIST");
  $sth->execute();

  while ( my $ref = $sth->fetchrow_hashref() ) {
    my $id         = $ref->{Id};
    my $user       = $ref->{User};
    my $host       = $ref->{Host};
    my $command    = $ref->{Command};
    my $state      = $ref->{State};
    my $query_time = $ref->{Time};
    my $info       = $ref->{Info};
    $info =~ s/^\s*(.*?)\s*$/$1/ if defined($info);
    next if ( $my_connection_id == $id );
    next if ( defined($query_time) && $query_time < $running_time_threshold );
    next if ( defined($command)    && $command eq "Binlog Dump" );
    next if ( defined($user)       && $user eq "system user" );
    next
      if ( defined($command)
      && $command eq "Sleep"
      && defined($query_time)
      && $query_time >= 1 );

    if ( $type >= 1 ) {
      next if ( defined($command) && $command eq "Sleep" );
      next if ( defined($command) && $command eq "Connect" );
    }

    if ( $type >= 2 ) {
      next if ( defined($info) && $info =~ m/^select/i );
      next if ( defined($info) && $info =~ m/^show/i );
    }

    push @threads, $ref;
  }
  return @threads;
}

sub main {
  if ( $command eq "stop" ) {
    ## Gracefully killing connections on the current master
    # 1. Set read_only= 1 on the new master
    # 2. DROP USER so that no app user can establish new connections
    # 3. Set read_only= 1 on the current master
    # 4. Kill current queries
    # * Any database access failure will result in script die.
    my $exit_code = 1;
    eval {
      ## Setting read_only=1 on the new master (to avoid accident)
      my $new_master_handler = new MHA::DBHelper();

      # args: hostname, port, user, password, raise_error(die_on_error)_or_not
      $new_master_handler->connect( $new_master_ip, $new_master_port,
        $new_master_user, $new_master_password, 1 );
      print current_time_us() . " Set read_only on the new master.. ";
      $new_master_handler->enable_read_only();
      if ( $new_master_handler->is_read_only() ) {
        print "ok.\n";
      }
      else {
        die "Failed!\n";
      }
      $new_master_handler->disconnect();

      # Connecting to the orig master, die if any database error happens
      my $orig_master_handler = new MHA::DBHelper();
      $orig_master_handler->connect( $orig_master_ip, $orig_master_port,
        $orig_master_user, $orig_master_password, 1 );

      ## Drop application user so that nobody can connect. Disabling per-session binlog beforehand
      $orig_master_handler->disable_log_bin_local();
      print current_time_us() . " Dropping app user on the orig master..\n";
      #FIXME_xxx_drop_app_user($orig_master_handler);

      ## Waiting for N * 100 milliseconds so that current connections can exit
      my $time_until_read_only = 15;
      $_tstart = [gettimeofday];
      my @threads = get_threads_util( $orig_master_handler->{dbh},
        $orig_master_handler->{connection_id} );
      while ( $time_until_read_only > 0 && $#threads >= 0 ) {
        if ( $time_until_read_only % 5 == 0 ) {
          printf
"%s Waiting all running %d threads are disconnected.. (max %d milliseconds)\n",
            current_time_us(), $#threads + 1, $time_until_read_only * 100;
          if ( $#threads < 5 ) {
            print Data::Dumper->new( [$_] )->Indent(0)->Terse(1)->Dump . "\n"
              foreach (@threads);
          }
        }
        sleep_until();
        $_tstart = [gettimeofday];
        $time_until_read_only--;
        @threads = get_threads_util( $orig_master_handler->{dbh},
          $orig_master_handler->{connection_id} );
      }

      ## Setting read_only=1 on the current master so that nobody (except SUPER) can write
      print current_time_us() . " Set read_only=1 on the orig master.. ";
      $orig_master_handler->enable_read_only();
      if ( $orig_master_handler->is_read_only() ) {
        print "ok.\n";
      }
      else {
        die "Failed!\n";
      }

      ## Waiting for M * 100 milliseconds so that current update queries can complete
      my $time_until_kill_threads = 5;
      @threads = get_threads_util( $orig_master_handler->{dbh},
        $orig_master_handler->{connection_id} );
      while ( $time_until_kill_threads > 0 && $#threads >= 0 ) {
        if ( $time_until_kill_threads % 5 == 0 ) {
          printf
"%s Waiting all running %d queries are disconnected.. (max %d milliseconds)\n",
            current_time_us(), $#threads + 1, $time_until_kill_threads * 100;
          if ( $#threads < 5 ) {
            print Data::Dumper->new( [$_] )->Indent(0)->Terse(1)->Dump . "\n"
              foreach (@threads);
          }
        }
        sleep_until();
        $_tstart = [gettimeofday];
        $time_until_kill_threads--;
        @threads = get_threads_util( $orig_master_handler->{dbh},
          $orig_master_handler->{connection_id} );
      }

      ## Terminating all threads
      print current_time_us() . " Killing all application threads..\n";
      $orig_master_handler->kill_threads(@threads) if ( $#threads >= 0 );
      print current_time_us() . " done.\n";
      $orig_master_handler->enable_log_bin_local();
      $orig_master_handler->disconnect();

      ## After finishing the script, MHA executes FLUSH TABLES WITH READ LOCK
      ## Remove the VIP from the original master
      eval {
        `ssh -p $orig_master_ssh_port $orig_master_ssh_user\@$orig_master_host " $ssh_stop_vip "`;
      };
      if ($@) {
        warn $@;
      }
      $exit_code = 0;
    };
    if ($@) {
      warn "Got Error: $@\n";
      exit $exit_code;
    }
    exit $exit_code;
  }

  elsif ( $command eq "start" ) {
    ## Activating master ip on the new master
    # 1. Create app user with write privileges
    # 2. Moving backup script if needed
    # 3. Register new master's ip to the catalog database

    # We don't return error even though activating updatable accounts/ip failed
    # so that we don't interrupt slaves' recovery.
    # If exit code is 0 or 10, MHA does not abort
    my $exit_code = 10;
    eval {
      my $new_master_handler = new MHA::DBHelper();

      # args: hostname, port, user, password, raise_error_or_not
      $new_master_handler->connect( $new_master_ip, $new_master_port,
        $new_master_user, $new_master_password, 1 );

      ## Set read_only=0 on the new master
      $new_master_handler->disable_log_bin_local();
      print current_time_us() . " Set read_only=0 on the new master.\n";
      $new_master_handler->disable_read_only();

      ## Creating an app user on the new master
      print current_time_us() . " Creating app user on the new master..\n";
      #FIXME_xxx_create_app_user($new_master_handler);
      $new_master_handler->enable_log_bin_local();
      $new_master_handler->disconnect();

      ## Update master ip on the catalog database, etc
      ## Bring the VIP up on the new master
      `ssh -p $new_master_ssh_port $new_master_ssh_user\@$new_master_host " $ssh_start_vip "`;
      $exit_code = 0;
    };
    if ($@) {
      warn "Got Error: $@\n";
      exit $exit_code;
    }
    exit $exit_code;
  }
  elsif ( $command eq "status" ) {

    # do nothing
    exit 0;
  }
  else {
    &usage();
    exit 1;
  }
}

sub usage {
  print
"Usage: master_ip_online_change --command=start|stop|status --orig_master_host=host --orig_master_ip=ip --orig_master_port=port --new_master_host=host --new_master_ip=ip --new_master_port=port\n";
  die;
}

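On node94, check that SSH is configured correctly for MHA:

    masterha_check_ssh --conf=/etc/masterha/app1.cnf

If "All SSH connection tests passed successfully." appears, the configuration is fine.

On node94, check that replication is configured correctly for MHA:

    masterha_check_repl --conf=/etc/masterha/app1.cnf

If "MySQL Replication Health is OK" is reported, the configuration is fine.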
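Both checks come down to looking for a success message in the tool output, which is easy to script. The sketch below is one possible way to do that; check_marker is a hypothetical helper, and the only strings it relies on are the two messages quoted in these notes.

```shell
#!/bin/sh
# Success messages quoted in these notes.
SSH_OK="All SSH connection tests passed successfully."
REPL_OK="MySQL Replication Health is OK"

# check_marker MARKER: read tool output on stdin and succeed only if the
# fixed string MARKER appears somewhere in it.
check_marker() {
    grep -qF -- "$1"
}

# Possible usage on the manager node (not executed here):
#   masterha_check_ssh  --conf=/etc/masterha/app1.cnf 2>&1 | check_marker "$SSH_OK"  && echo "ssh ok"
#   masterha_check_repl --conf=/etc/masterha/app1.cnf 2>&1 | check_marker "$REPL_OK" && echo "repl ok"
```

Redirecting stderr into the pipe is a precaution in case the tools log there; the match is a plain substring test, so a "NOT OK" message does not pass.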

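Start MHA in the foreground on node94:

    masterha_manager --conf=/etc/masterha/app1.cnf --ignore_last_failover    # start monitoring in the foreground

Simulate a crash of the node93 master and watch the automatic failover:

Stop the mysql service on node93. The masterha_manager process started on node94 then exits on its own, and checking the other nodes shows that the master role has moved.

Then start mysql on node93 again. When it comes back online it does not become master automatically.

[!! Note: if node93 is simply brought back online, the cluster ends up with two masters (split brain), and masterha_manager cannot start either.] It first has to be turned into a slave by hand, as follows:

On node93, run: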

> change master to
    master_host='10.1.20.94',
    master_user='rpl',
    master_password='rpl',
    master_log_file='mysql-bin.000003',
    master_log_pos=1881;     # for the file and position, look in manager.log under /data/masterha/app1 on node94 to find the exact binlog coordinates

> start slave;

> show slave status\G
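Finding the binlog file and position in manager.log by eye is error-prone, so a small helper can pull out the relevant line. This is a sketch under an assumption: that the failover log contains a line spelling out a CHANGE MASTER TO statement for the new master. The function name find_change_master is hypothetical.

```shell
#!/bin/sh
# find_change_master LOGFILE: print the last line of LOGFILE that mentions
# a CHANGE MASTER TO statement (case-insensitive match).
find_change_master() {
    grep -i "CHANGE MASTER TO" "$1" | tail -n 1
}

# Possible usage on node94 (not executed here):
#   find_change_master /data/masterha/app1/manager.log
```

If nothing is printed, the log format differs from the assumption and the coordinates have to be read from the log manually.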

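After node93 has rejoined the cluster, run masterha_manager --conf=/etc/masterha/app1.cnf --ignore_last_failover again on the node94 manager node; this time it starts and keeps running. (Do not close this window during the verification.)

Check the current replication topology:

    Open another xshell window on node94 and run masterha_check_repl --conf=/etc/masterha/app1.cnf

The output shows that the master and slave roles have changed.

Check whether masterha is running:

In another xshell window, run masterha_check_status --conf=/etc/masterha/app1.cnf

To stop masterha, do not use stop or kill; use the following command instead:

    masterha_stop --conf=/etc/masterha/app1.cnf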

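Switching the master manually:

    masterha_master_switch -h   # show the help text

    masterha_master_switch --conf=/etc/masterha/app1.cnf --master_state=alive --new_master_host=10.1.20.93 --new_master_port=3306 --orig_master_is_new_slave --running_updates_limit=10000

Three points to keep in mind for a manual switch:

1. Before switching, turn off the event scheduler on both the old master and the host about to be promoted, otherwise the switch is refused (set global event_scheduler = OFF;).

2. Before switching, stop MHA monitoring (masterha_stop --conf=/etc/masterha/app1.cnf).

3. While it runs, the switch script automatically executes FLUSH TABLES WITH READ LOCK on the original master, and once the switch completes it runs UNLOCK TABLES to release that lock.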
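The preparation steps above can be collected into a single plan. The sketch below only prints the commands in a workable order (a dry run); print_switch_plan is a hypothetical helper, the relative order of the first two steps is my choice, and the mysql client calls assume credentials are supplied some other way, for example through ~/.my.cnf.

```shell
#!/bin/sh
CONF=/etc/masterha/app1.cnf   # config path used throughout these notes

# print_switch_plan OLD_MASTER NEW_MASTER: print the commands for a manual
# online switch. Dry run only: nothing is executed.
print_switch_plan() {
    old="$1"
    new="$2"
    # Stop MHA monitoring before switching.
    echo "masterha_stop --conf=$CONF"
    # Turn off the event scheduler on both hosts involved.
    for host in "$old" "$new"; do
        echo "mysql -h $host -e 'set global event_scheduler = OFF;'"
    done
    # The switch itself, with the options used in these notes.
    echo "masterha_master_switch --conf=$CONF --master_state=alive --new_master_host=$new --new_master_port=3306 --orig_master_is_new_slave --running_updates_limit=10000"
}

print_switch_plan 10.1.20.94 10.1.20.93
```

Printing instead of executing leaves room to confirm the host roles before anything touches the cluster.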

The mail script needs this module installed first:

    yum install perl-Mail-Sender

If sending fails, check /tmp/monitormail.log for the reason.

If MHA misbehaves:

Look at the logs under /data/masterha/app1/

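masterha_manager has a few other useful startup options:

    --remove_dead_master_conf    After a failover, the old master's IP is removed from the configuration file.

    --manager_log    Where the log is written; add it if you want logs kept in a consistent place.

    --ignore_last_failover    Ignores the file left behind by the previous failover. By default, after a failover MHA creates app1.failover.complete in the manager working directory (/data/masterha/app1 as set above). If that file is still there at the next failure, the failover is not triggered unless the file was deleted by hand after the first one. In other words, by default MHA will not fail over when it detects consecutive failures less than 8 hours apart; the restriction is there to avoid a ping-pong effect. [To force a failover, remove app1.failover.complete first.]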
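Before relying on automatic failover again, it helps to know whether a leftover marker file would block it. The sketch below is a rough pre-check based on the description above. It assumes the 8-hour window is measured from the file's modification time, which these notes do not confirm, and failover_blocked is a hypothetical helper name.

```shell
#!/bin/sh
# failover_blocked WORKDIR APP: succeed if WORKDIR/APP.failover.complete
# exists and was modified less than 8 hours (480 minutes) ago.
failover_blocked() {
    marker="$1/$2.failover.complete"
    [ -f "$marker" ] && [ -n "$(find "$marker" -mmin -480)" ]
}

# Working directory and application name as configured in these notes.
if failover_blocked /data/masterha/app1 app1; then
    echo "recent app1.failover.complete found: remove it or use --ignore_last_failover"
else
    echo "no recent failover marker found"
fi
```

The check only reads the file system, so it is safe to run at any time on the manager node.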